The Bioinformatics and Biostatistics Core at Joslin Diabetes Center, an affiliate of Harvard Medical School, is composed of the Director, Dr. Jonathan Dreyfuss, and the senior bioinformatician and biostatistician, Dr. Hui Pan. Both hold a PhD and have at least a decade of experience in bioinformatics. The Core has access to Joslin Diabetes Center’s own Linux-based high-performance computing cluster, managed by Joslin Diabetes Center Research Computing, which has 1.44 TB of RAM and 325 TB of storage and is loaded with commonly used software, such as Python and R.
The Core offers support for data-driven projects related to basic, clinical, and translational research, with an emphasis on diabetes. The Core aims to ensure that researchers take advantage of the most modern and robust methods available in the field of bioinformatics and biostatistics, including for new data types and novel analysis requests.
Free, brief initial consultations (~1 hour) are available to design experiments and discuss analysis strategies. These can be requested through email. Free consultations that require slightly more time or limited data analysis, such as power calculations, can be requested by Harvard Medical School labs through Harvard Catalyst biostatistics consulting: click the Request Consultation button, then check the Biostatistics Consultation box in the form (NOT the Bioinformatics Consultation box). Biostatistics consultation requests from Joslin are routed automatically to Jonathan Dreyfuss.
For omic and non-omic data, we offer sample size and power calculations, experimental design recommendations, and multiple analysis approaches, such as ANOVA, ANCOVA, repeated measures ANOVA, handling of missing values, nonparametric analyses, causal inference (e.g. mediation analysis), and machine learning.
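As an illustration of what a sample size calculation involves, here is a minimal Python sketch for a two-sided comparison of two group means. It uses the standard normal approximation rather than the exact t-distribution, so it can slightly underestimate the required n for small samples. The function name `samples_per_group` and the default `alpha` and `power` values are assumptions chosen for the example, not a description of the Core's own workflow.

```python
import math
from statistics import NormalDist


def samples_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sided, two-sample
    comparison of means (normal approximation).

    effect_size is the standardized difference (Cohen's d):
    the difference in means divided by the common standard deviation.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    if not (0 < alpha < 1 and 0 < power < 1):
        raise ValueError("alpha and power must be between 0 and 1")

    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = std_normal.inv_cdf(power)          # quantile for the target power

    n = 2 * ((z_alpha + z_power) / effect_size) ** 2
    return math.ceil(n)


if __name__ == "__main__":
    # A medium effect (d = 0.5) at 5% significance and 80% power.
    print(samples_per_group(0.5))  # 63 under the normal approximation
```

Real studies often need adjustments that this sketch ignores, such as unequal group sizes, repeated measures, or multiple testing across many features.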
We offer omic analysis of all data types, including next-generation sequencing (of bulk tissue, single cells, single nuclei, or spatial), mass spectrometry, microarrays, and qPCR. We also meta-analyze multiple experiments and integrate data types, construct and analyze networks such as gene networks, and apply metabolic flux analysis such as inferring fluxes from Seahorse flux analyzer data.
The typical bioinformatics pipeline includes normalization, quality control, principal component analysis (PCA), differential abundance, pathway analysis, and visualization. This pipeline takes about 10 hours for most data types, such as bulk RNA-seq, proteomics (including SomaLogic), metabolomics, and phosphoproteomics. It takes about half as much time for a normalized table of counts from high-quality samples, and about twice as much time (about 20 hours) for a dataset of raw scRNA-seq data. We can process 10x Genomics data so that it can be viewed in the 10x Genomics Loupe browser in about 5 hours. These estimates are mostly independent of the number of samples. Post-pipeline requests often involve accounting for sample quality, new comparisons, subgroup analyses, and additional visualizations.
If we assist with your manuscript, please acknowledge the help of the Core. We do not expect to become authors, and becoming an author does not negate service charges. Our bioinformatics and biostatistics services cost $110/hour for Joslin investigators and $160/hour for all others. A common turnaround time is 1-2 weeks, but urgent requests (e.g. for grant deadlines) are accommodated when possible.
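To give a rough sense of budget, the approximate hour estimates and hourly rates stated above can be multiplied together. The Python sketch below does only that arithmetic. The dictionary keys and the function name `estimate_cost` are hypothetical labels invented for this example, and the figures are the approximate ones quoted in the text, so actual charges may differ.

```python
# Hourly rates in USD, as quoted above.
HOURLY_RATE_USD = {
    "joslin": 110,    # Joslin investigators
    "external": 160,  # all others
}

# Approximate hours per task, as quoted above (labels are hypothetical).
TYPICAL_HOURS = {
    "standard_pipeline": 10,   # e.g. bulk RNA-seq, proteomics, metabolomics
    "normalized_counts": 5,    # about half the standard pipeline
    "raw_scrna_seq": 20,       # about twice the standard pipeline
    "loupe_browser_prep": 5,   # 10x Genomics data prepared for Loupe browser
}


def estimate_cost(task, affiliation):
    """Return an approximate cost in USD for a task and affiliation."""
    return TYPICAL_HOURS[task] * HOURLY_RATE_USD[affiliation]


if __name__ == "__main__":
    for task in TYPICAL_HOURS:
        joslin = estimate_cost(task, "joslin")
        external = estimate_cost(task, "external")
        print(f"{task}: ~${joslin} (Joslin), ~${external} (external)")
```

Post-pipeline requests, such as new comparisons or additional visualizations, would add hours beyond these estimates.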
We can analyze and integrate public data sets, which you can cite and include in your manuscript. One helpful resource to find public data of all types is Omics Discovery Index, which includes data from the NIH NCBI Gene Expression Omnibus (GEO). Using GEO Profiles, you can search for the expression of a gene across GEO's curated data sets.
Free, internal services
We offer seminars that teach the free R language and environment for biostatistics and bioinformatics based on a public interactive website we built at https://jdreyf.shinyapps.io/zero2bioinfo-interactively.
For Joslin researchers only, we maintain an in-house gene expression database where a user can search for a gene and see its expression across approximately 75 studies. The Joslin intranet has instructions for logging into the database. The snapshot below shows an example; the fold-change (FC), p-value (P), and Benjamini-Hochberg (BH) false discovery rate for each comparison appear beneath the graphs.
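For readers unfamiliar with the Benjamini-Hochberg false discovery rate, the following Python sketch shows how BH-adjusted p-values are commonly computed from a list of raw p-values. It is a generic illustration of the procedure, not the code behind the in-house database, and the function name `benjamini_hochberg` is an assumption for this example.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order.

    Each raw p-value is scaled by m / rank (m = number of tests, rank =
    position when sorted ascending), then a running minimum taken from the
    largest p-value downward keeps the adjusted values monotone.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])

    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value (rank m) down to the smallest (rank 1).
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    raw = [0.01, 0.04, 0.03, 0.005]
    for p, q in zip(raw, benjamini_hochberg(raw)):
        print(f"P = {p:.3f}  BH = {q:.3f}")
```

A comparison is typically called significant when its BH value falls below a chosen threshold, which bounds the expected proportion of false positives among the comparisons called significant.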
Joslin Diabetes Center

Name | Role | Email
---|---|---
Jonathan Dreyfuss, PhD | Director | Jonathan.Dreyfuss@joslin.harvard.edu
Hui Pan, PhD | Senior Bioinformatician III | Hui.Pan@joslin.harvard.edu