Veit Schwämmle
Associate professor
Email
Group page
Protein
Research Group
Department of Biochemistry and Molecular
Biology
University of Southern
Denmark
The Computational Proteomics
Group develops and applies computational solutions
for improved data analysis in large-scale omics
experiments with a focus on proteins and their
post-translational modifications (PTMs). The aim is
to better understand the functional protein states
in order to determine, confirm and predict their
contribution to cell behavior and disease.
Our main research interests are
Software and workflow development for
data from protein mass spectrometry
experiments
Chromatin biology: regulatory control by
histone modifications
Tools for quantification and
interpretation of omics data
Modelling and interpretation of
signalling in molecular pathways
For the full publication record, see
Google Scholar.
R scripts, Shiny apps and Jupyter notebooks for functional analysis of omics data
New software solutions for middle-down and top-down mass spectrometry
Deep learning applications for improved feature detection
Portable software pipelines and automatic workflow composition
Co-expression of proteins and protein complexes
Determination and visualization of PTM behavior and their crosstalk
Visit CrosstalkDB,
which hosts quantitative data on crosstalk between
PTMs on histone proteins measured by middle-down mass
spectrometry.
The Elixir Tools and
Data Service Registry, bio.tools, launched in
January 2015, hosts details about tens of thousands
of databases and bioinformatics tools.
Our new MS2AI
tool allows retrieving and categorizing millions
of mass spectra.
The Shiny
framework facilitates embedding R scripts into
simple GUIs. We develop web applications for
interactive data analysis (see
Shiny Apps).
Try our tools for variance-sensitive clustering of
large data sets (Launch VSClust)
and combined statistical testing (Launch PolySTest),
check out our tools to characterize protein
complexes based on your data (Launch ComplexBrowser)
or generally in human cells (Launch CoExpresso),
and try our new tool for protein inference and
summarization (Launch VIQoR).
These apps are also available as stand-alone
versions via Docker and conda, as well as via the
command line for integration into workflows.
Review and Tutorials For an overview of methods used in PTMomics see our review. We published tutorials for CrossTalkMapper, VSClust + complex analysis and general PTM analysis, and have further material on our Gulbenkian course Integrative Biological Interpretation using Proteomics.
Characterization of PTM crosstalk We applied large-scale estimation of crosstalk between nearby residues (ref. and ref.). Interplay scores provide quantitative information about negative and positive crosstalk between PTMs on a protein (ref. and ref.). Complex patterns of PTM crosstalk can be visualized as Crosstalk Maps with our new R scripts (ref.).
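As a rough illustration of how an interplay score of this kind can be computed, the sketch below assumes the log-ratio form I = log2(f_XY / (f_X * f_Y)), where f_X and f_Y are the relative abundances of two PTMs and f_XY is the relative abundance of molecules carrying both. The function name and the example values are hypothetical; the cited references give the exact definition used in our tools.

```python
import math


def interplay_score(f_x: float, f_y: float, f_xy: float) -> float:
    """Log2 ratio of observed co-occurrence to that expected under independence.

    Assumed form (see lead-in): I = log2(f_xy / (f_x * f_y)).
    Positive values suggest positive crosstalk, negative values mutual exclusion.
    """
    if not (0 < f_x <= 1 and 0 < f_y <= 1 and 0 < f_xy <= 1):
        raise ValueError("relative abundances must lie in (0, 1]")
    return math.log2(f_xy / (f_x * f_y))


# Hypothetical relative abundances, for illustration only
positive = interplay_score(0.2, 0.3, 0.12)     # observed twice as often as expected
negative = interplay_score(0.2, 0.3, 0.03)     # observed half as often as expected
independent = interplay_score(0.5, 0.5, 0.25)  # observed exactly as expected
```

A score near zero means the two marks co-occur about as often as expected if they were placed independently.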
Statistical tests The complexity of
high-throughput mass spectrometry experiments
often leads to low replicate numbers and many
missing values. We implemented a new test to
simultaneously consider missing values and
quantitative changes, which we combined with
well-performing statistical tests for
high-confidence detection of differentially
regulated features (ref.; source
code and installation).
Launch PolySTest
The old LimmaRP app (ref.)
is still accessible here:
Launch LimmaRP
Data clustering We first provided a simple way to estimate appropriate parameter values for fuzzy c-means clustering (ref.; old app: Launch FuzzyClust). The new tool VSClust additionally considers feature variance, leading to more accurate clustering results (ref. and tutorial). Launch VSClust (source code and installation)
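For readers unfamiliar with fuzzy c-means, the following minimal sketch shows the standard textbook algorithm in NumPy. It is not the VSClust implementation and does not include its variance-sensitive extension or parameter estimation; the function name, default fuzzifier m = 2 and the synthetic data are assumptions made for illustration.

```python
import numpy as np


def fuzzy_c_means(x, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means. Returns (centers, memberships)."""
    rng = np.random.default_rng(seed)
    # Random initial memberships; each row sums to 1
    u = rng.random((x.shape[0], n_clusters))
    u /= u.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        w = u ** m
        # Cluster centers are membership-weighted means of the data
        centers = (w.T @ x) / w.sum(axis=0)[:, None]
        # Euclidean distance of every point to every center
        d = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)  # guard against division by zero
        inv = d ** (-2.0 / (m - 1.0))
        u_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(u_new - u).max() < tol
        u = u_new
        if converged:
            break
    return centers, u


# Synthetic example: two well-separated groups of 20 points each
rng = np.random.default_rng(1)
data = np.vstack([rng.normal(0.0, 0.1, (20, 2)), rng.normal(5.0, 0.1, (20, 2))])
centers, memberships = fuzzy_c_means(data, n_clusters=2)
labels = memberships.argmax(axis=1)
```

Unlike hard clustering, every feature receives a membership value for each cluster, so noisy features can be identified by their low maximum membership.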
Protein Complexes We developed tools to quantify protein complexes and to quantitatively characterize the behavior of their subunits. Launch ComplexBrowser to investigate the behavior of protein complexes in your proteomics data set (ref, source code and installation). Launch CoExpresso to look for co-expression patterns across up to 150 human cell types (ref, source code and installation).
Middle-down mass spectrometry We develop and apply a workflow to quantify PTMs on histone tails (refs).
CrosstalkDB With quantitative data
from middle-down and top-down mass
spectrometry, the web server collects and
analyzes the input files, followed by
statistical assessment of the crosstalk between
measured PTMs
(refs).
Access CrosstalkDB
Computational models Taking simple
rules for writing, propagating and deleting
histone PTMs on chromatin, we were able to
reproduce global patterns measured by ChIP-seq
experiments. The implementation of crosstalk
rules results in a rich spatial and temporal
behavior (ref).
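To give a feeling for this type of rule-based model, here is a toy sketch of a one-dimensional chain of nucleosomes with three stochastic rules: random writing of a mark, propagation from a marked neighbour, and deletion. It is not the published model; the function name, the rates, the lattice size and the synchronous update scheme are all assumptions chosen for illustration.

```python
import numpy as np


def simulate_marks(n_sites=200, n_steps=500, write=0.001, spread=0.1,
                   erase=0.05, seed=0):
    """Toy 1D chromatin model: 1 = site carries the mark, 0 = unmodified.

    Returns an array of shape (n_steps, n_sites) with the state after each step.
    """
    rng = np.random.default_rng(seed)
    state = np.zeros(n_sites, dtype=int)
    history = np.empty((n_steps, n_sites), dtype=int)
    for t in range(n_steps):
        # Propagation applies if a direct neighbour is marked (periodic boundary)
        has_marked_neighbour = (np.roll(state, 1) + np.roll(state, -1)) > 0
        p_gain = write + spread * has_marked_neighbour
        gain = (state == 0) & (rng.random(n_sites) < p_gain)
        loss = (state == 1) & (rng.random(n_sites) < erase)
        # Synchronous update: unmarked sites may gain, marked sites may lose
        state = np.where(gain, 1, np.where(loss, 0, state))
        history[t] = state
    return history


history = simulate_marks()
marked_fraction = history.mean(axis=1)  # fraction of marked sites over time
```

Plotting `history` as an image (time against position) shows domains of marked sites that nucleate, spread and decay, which is the kind of spatial and temporal pattern such rules produce.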
Coming soon Automated software pipeline for the analysis of middle-down MS data and an updated version of CrosstalkDB with an API and automatic upload.
Peptide representations for deep learning In collaboration with the Röttger group, we developed an environment to retrieve millions of mass spectra (MS1 and MS2) from the public repository PRIDE, to categorize them in a database, and to create data representations that can be directly used for machine learning purposes: MS2AI (refs).
Coming soon Investigation of bias and balance in deep-learning-assisted peptide identification.
See also the AIMe registry to report AI-based biomedical results in a standardized and reproducible manner (ref).
Standardization and community efforts
See the proposed notation of proteoforms:
ProForma (ref).
As part of ELIXIR
DK, we annotate proteomics tools for
bio.tools
(refs.) and are part of the proteomics
community (see also white paper, ref).
WOMBAT-P We implemented different scalable and portable workflows for the analysis of label-free data as part of an ELIXIR implementation study. They allow systematic comparison of the performance of different data analysis workflows.
ProtProtocols As part of a project within the EuBIC initiative, we developed a framework for fully reproducible, documented and user-friendly pipelines for specific cases of proteomics data analysis. Within this framework, we just finished IsoProt, a full data analysis pipeline for iTRAQ/TMT data (ref). Check it out here: IsoProt at GitHub, or download it via our docker-launcher.
Biological evolution See my former studies of aging, sympatric speciation and competitive cellular automata (refs).
Simulations Almost anything can be simulated on the computer including sand dunes, opinion dynamics and linguistics (refs).
Statistical Mechanics See my work on generalized entropies and Fokker-Planck equations (refs).
Old course on quantitative data analysis of proteomics data
This course presents an overview of the main methods for the analysis of data from peptide mass spectrometry and other omics data.
Click on the picture for course material.
Courses
My group runs two Master's courses on Biostatistics in R (BMB830 and BMB831) at the Department of Biochemistry and Molecular Biology and a PhD course, Workshops in Applied Bioinformatics (BMB209). We also co-teach the bachelor courses Applications of mathematics in life sciences (BMB539) and Bioinformatics I (BMB511), as well as Biostatistics and Experimental design as part of the Master's programme Life Science Engineering and Informatics of the Sino-Danish University.
Click on the picture for more information about Biostatistics in R.
Biostatistics in R
Lecture I: Introduction
Corresponding R script
Lecture II: Basics
Corresponding R script
Lecture III: Arrays, matrices and data frames
Corresponding R script
Lecture IV: Data manipulation
Corresponding R script
Lecture V: Visual methods
Corresponding R script
Lecture VI: Basic statistics
Corresponding R script
Lecture VII: Data modeling
Corresponding R script
Lecture VIII: Statistical tests
Corresponding R script
Lecture IX: Multi-variate analysis
Corresponding R script
Lecture X: Interactivity in R
Corresponding R script
Exercises
Course description:
Lecture I: Data analysis of omics data: general aspects
Lecture II: Proteomics data
Lecture III: Transcriptomics data
Lecture IV: Epigenomics data
Lecture V: Metabolomics data
Lecture VI: Data interpretation
Projects
We offer projects for Bachelor and Master students. To get an idea, please take a look at our research. If you are interested, please contact me.
Click on the picture for a list of old projects.
Former or current student projects
First year bachelor projects
Functional analysis of a fish oil diet
How strong are our muscles?
Bachelor projects
Investigation of PTM crosstalk in mice to resolve age- and tissue-dependent patterns
Large-scale investigation of protein variance in cancer tissues
Enhanced and animated visualization of temporal changes on the histone PTM landscape
Optimization of the data analysis pipeline to characterize combinatorial post-translational modifications (PTMs) of histones
A proteomics analysis of protein abundance variations in cancer
Optimization of multi-threading capabilities in data clustering approaches
Master projects (individual study projects and full theses)
Determination of cellular age as a method to assess aging effects in multicellular organisms
Bioinformatics in proteomics - supervised data analysis focused on protein complexes.
Intrinsically Disordered Protein Domains and Post-Translational Modifications - A Computational Biology Study
Web-based application for visualization of proteins and their post-translational modifications (PTMs) based on their quantification
A fully reproducible and user-friendly workflow for the analysis of PTM-omics data
Implementation and optimization of a fully automatized pipeline for the analysis of middle-down MS data
Tandem-Mass spectrometry prediction based on Liquid Chromatography-Mass Spectrometry chromatogram using deep neural networks
Implementation of fully reproducible and scalable data analysis workflows in bioinformatics using Nextflow
Computational proteomics analysis of histones and their post-translational modifications
Development of a statistical workflow to determine relative changes in post-translational modifications
All apps are accessible as web services on this server. This means that they might be temporarily inaccessible due to high usage. In this case, please try again later or run the app(s) locally. For local use, we provide Docker containers (usually "veitveit/app_name_in_lower_case"), access through the SDU Cloud (only available if you are affiliated with a Danish institution) and conda packages (only PolySTest and VSClust).
Investigate the quantitative behavior of protein complexes and arbitrary protein groups in human cells based on the data from ProteomicsDB. Find the source code here.
Elixir
(European Infrastructure for Biological
Information)
The Danish node
of the Elixir
consortium implements and maintains the registry of
software tools in the life sciences (ref). The registry adopts rich
annotations by basing software descriptions on the
EDAM
ontologies. I am involved in several projects
aiming to improve and extend curation of software
tools, the proteomics use case, automatic synthesis
of workflows from the registry content, as well as
maintenance of the EDAM ontology.
EuBIC
(European Bioinformatics Initiative)
Initiative of bioinformaticians in Europe to
improve support and coordination of training and
software development in proteomics informatics.
Conference: We are organizing conferences,
hackathons and workshops in Computational
Proteomics.