Research

The Computational Proteomics Group develops and applies computational solutions for improved data analysis in large-scale omics experiments, with a focus on proteins and their post-translational modifications (PTMs). The aim is to better understand functional protein states in order to determine, confirm and predict their contribution to cell behavior and disease.
Our main research interests are

Software and workflow development for data from protein mass spectrometry experiments

Chromatin biology: regulatory control by histone modifications

Tools for quantification and interpretation of omics data

Modelling and interpretation of signalling in molecular pathways
For the full publication record, see Google Scholar.

Software development
Proteomics informatics
  • R scripts, Shiny apps and Jupyter notebooks for functional analysis of omics data

  • New software solutions for middle-down and top-down mass spectrometry

  • Deep learning applications for improved feature detection

  • Portable software pipelines and automatic workflow composition

  • Co-expression of proteins and protein complexes

  • Determination and visualization of PTM behavior and their crosstalk

Databases

Visit CrosstalkDB, which hosts quantitative data for crosstalk between histone proteins measured by middle-down mass spectrometry.
The Elixir Tools and Data Service Registry, bio.tools, launched in January 2015, hosts details about tens of thousands of databases and bioinformatics tools.
Our new MS2AI tool allows retrieving and categorizing millions of mass spectra.

Shiny apps

The Shiny framework facilitates embedding R scripts into simple GUIs. We develop web applications for interactive data analysis (see Shiny Apps).
Try our tools for variance-sensitive clustering of large data sets (Launch VSClust) and
combined statistical testing (Launch PolySTest),
check out our tools to characterize protein complexes based on your data (Launch ComplexBrowser) or generally in human cells (Launch CoExpresso),
and try our new tool for protein inference and summarization (Launch VIQoR).
These apps are also available as stand-alone versions via Docker and conda, as well as via the command line for integration into workflows.

Quantitative analysis

Chromatin Biology
  • Middle-down mass spectrometry: We develop and apply a workflow to quantify PTMs on histone tails (refs).

  • CrosstalkDB: Given quantitative data from middle-down and top-down mass spectrometry, the web server collects and analyzes the input files and then statistically assesses the crosstalk between the measured PTMs (refs).
    Access CrosstalkDB

  • Computational models: Taking simple rules for writing, propagating and deleting histone PTMs on chromatin, we were able to reproduce global patterns measured by ChIP-seq experiments. The implementation of crosstalk rules results in a rich spatial and temporal behavior (ref).

  • Coming soon: Automated software pipeline for the analysis of middle-down MS data and an updated version of CrosstalkDB with API and automatic upload.
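
As an illustration of what a statistical assessment of crosstalk between two PTMs can look like, the following minimal sketch computes a log-ratio score that compares how often two marks are observed together with how often they would co-occur if they were independent. This is an assumption made for illustration and not necessarily the exact statistic implemented in CrosstalkDB; the function name and the example frequencies are hypothetical.

```python
import math


def interplay_score(freq_a, freq_b, freq_ab):
    """Log-ratio of observed co-occurrence to the value expected under independence.

    freq_a, freq_b: relative abundances (0..1] of all peptides carrying mark A or mark B.
    freq_ab: relative abundance (0..1] of peptides carrying both marks.
    A positive score suggests the marks co-occur more often than expected,
    a negative score suggests they tend to exclude each other.
    """
    if min(freq_a, freq_b, freq_ab) <= 0:
        raise ValueError("all frequencies must be positive")
    return math.log(freq_ab / (freq_a * freq_b))


# Hypothetical example values, not measured data.
score = interplay_score(freq_a=0.5, freq_b=0.4, freq_ab=0.3)
print(f"interplay score: {score:.3f}")
```

A score near zero means the observed co-occurrence matches the independence expectation. Marks that were never observed together need separate handling, since the logarithm is undefined at zero.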

Deep learning in proteomics
  • Peptide representations for deep learning: In collaboration with the Röttger group, we developed an environment to retrieve millions of mass spectra (MS1 and MS2) from the public repository PRIDE, to categorize them in a database, and to create data representations that can be directly used for machine learning purposes: MS2AI (refs).

  • Coming soon: Investigation of the bias and balance in deep-learning assisted peptide identification.

  • See also the AIMe registry to report AI-based biomedical results in a standardized and reproducible manner (ref).

Workflows and standards

Complex Systems
  • Biological evolution: See my former studies of aging, sympatric speciation and competitive cellular automata (refs).

  • Simulations: Almost anything can be simulated on the computer, including sand dunes, opinion dynamics and linguistics (refs).

  • Statistical Mechanics: See my work on generalized entropies and Fokker-Planck equations (refs).


Old course on quantitative data analysis of proteomics data

This course presents an overview of the main methods for analyzing data from peptide mass spectrometry and other omics experiments.


Click on the picture for course material.

Data analysis of proteomics data


Courses

My group runs two Master's courses on Biostatistics in R (BMB830 and BMB831) at the Department of Biochemistry and Molecular Biology and a PhD course, Workshops in Applied Bioinformatics (BMB209). We co-teach the bachelor courses Applications of mathematics in life sciences (BMB539) and Bioinformatics I (BMB511), and are involved in Biostatistics and Experimental design as part of the Master's programme Life Science Engineering and Informatics of the Sino-Danish University.


Click on the picture for more information about Biostatistics in R.

Biostatistics in R

Biostatistics in R I

Lecture I: Introduction
Corresponding R script
Lecture II: Basics
Corresponding R script
Lecture III: Arrays, matrices and data frames
Corresponding R script
Lecture IV: Data manipulation
Corresponding R script
Lecture V: Visual methods
Corresponding R script
Lecture VI: Basic statistics
Corresponding R script
Lecture VII: Data modeling
Corresponding R script
Lecture VIII: Statistical tests
Corresponding R script
Lecture IX: Multi-variate analysis
Corresponding R script
Lecture X: Interactivity in R
Corresponding R script
Exercises

Biostatistics in R II

Course description:
Lecture I: Data analysis of omics data: general aspects
Lecture II: Proteomics data
Lecture III: Transcriptomics data
Lecture IV: Epigenomics data
Lecture V: Metabolomics data
Lecture VI: Data interpretation

Projects

We offer projects for Bachelor's and Master's students. To get an idea, please take a look at our research. If you are interested, please contact me.


Click on the picture for a list of old projects.

Former or current student projects

First year bachelor projects

Functional analysis of a fish oil diet

How strong are our muscles?

Bachelor projects

Investigation of PTM crosstalk in mice to resolve age- and tissue-dependent patterns

Large-scale investigation of protein variance in cancer tissues

Enhanced and animated visualization of temporal changes on the histone PTM landscape

Optimization of the data analysis pipeline to characterize combinatorial post-translational modifications (PTMs) of histones

A proteomics analysis of protein abundance variations in cancer

Optimization of multi-threading capabilities in data clustering approaches

Master projects (individual study projects and full theses)

Determination of cellular age as a method to assess aging effects in multicellular organisms

Bioinformatics in proteomics - supervised data analysis focused on protein complexes.

Intrinsically Disordered Protein Domains and Post-Translational Modifications - A Computational Biology Study

Web-based application for visualization of proteins and their post-translational modifications (PTMs) based on their quantification

A fully reproducible and user-friendly workflow for the analysis of PTM-omics data

Implementation and optimization of a fully automatized pipeline for the analysis of middle-down MS data

Tandem-Mass spectrometry prediction based on Liquid Chromatography-Mass Spectrometry chromatogram using deep neural networks

Implementation of fully reproducible and scalable data analysis workflows in bioinformatics using Nextflow

Computational proteomics analysis of histones and their post-translational modifications

Development of a statistical workflow to determine relative changes in post-translational modifications

Availability

All apps are accessible as web services on this server. This means that they might be temporarily inaccessible when usage is too high. In this case, please try again later or run the app(s) locally. For local installations, we provide Docker containers (usually "veitveit/app_name_in_lower_case"), access through the SDU Cloud (only works if you are affiliated with a Danish institution), and conda packages (only PolySTest and VSClust).
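
The container naming convention above can be turned into a `docker run` command with a small helper. This is a minimal sketch under stated assumptions: the helper names are hypothetical, the convention is only described as "usual", so individual apps may deviate from it, and port 3838 is assumed because it is the default port of Shiny Server. Check each app's own documentation for the actual image name and port.

```python
def docker_image(app_name, namespace="veitveit"):
    """Build the image name following the 'veitveit/app_name_in_lower_case' convention."""
    return f"{namespace}/{app_name.lower()}"


def docker_run_command(app_name, host_port=3838, container_port=3838):
    """Return the argument list for running an app locally.

    Assumption: the app listens on port 3838 inside the container
    (the Shiny Server default); adjust if the app documents another port.
    """
    return [
        "docker", "run", "--rm",
        "-p", f"{host_port}:{container_port}",
        docker_image(app_name),
    ]


# Print the command for one of the apps named on this page.
print(" ".join(docker_run_command("PolySTest")))
```

The printed command can be pasted into a terminal, or the list can be passed to `subprocess.run` from a workflow script. If the assumed port is correct, the app is then reachable in a browser at `http://localhost:3838`.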

VSClust: Variance-sensitive clustering

Improved clustering of any quantitative data, statistical testing and pathway analysis. Find the source code here.

PolySTest: Detection of differentially regulated features

Combined statistical testing for data with few replicates and missing values. Find the source code here.

QC and quantification of protein complexes

Carry out quality control of quantification in your dataset and investigate the behavior of protein complexes. Find the source code here.

Co-regulation of protein groups in human cells

Investigate the quantitative behavior of protein complexes and arbitrary protein groups in human cells based on the data from ProteomicsDB. Find the source code here.

Interactive tool for protein inference, summarization, and visualization

Run different ways of (parsimonious) protein inference and optimized summarization based on factor analysis. The results can be extensively assessed both in numbers and visually. Find the source code here.


Elixir
(European Infrastructure for Biological Information)

The Danish node of the Elixir consortium implements and maintains the registry of software tools in life science (ref). The registry provides rich annotations by basing software descriptions on the EDAM ontology. I am involved in several projects aiming to improve and extend the curation of software tools, the proteomics use case, the automatic synthesis of workflows from the registry content, and the maintenance of the EDAM ontology.


EuBIC
(European Bioinformatics Initiative)

Initiative of bioinformaticians in Europe to improve support and coordination of training and software development in proteomics informatics.
Conference: We are organizing conferences, hackathons and workshops in Computational Proteomics.
