lowestprime / BP-DNAm /
This project investigates accelerated biological aging in the largest bipolar disorder DNA methylation cohort to date, aiming to identify differences in epigenetic age acceleration, along with their drivers and modifiers, between individuals with bipolar disorder and controls. DNA methylation data from Illumina EPIC arrays are preprocessed and quality-controlled, with particular attention to missing probes and data normalization. GrimAge2 and other epigenetic aging algorithms from the pyaging Python package are applied. Statistical analyses, including t-tests, ANCOVA, and correlation analysis, are conducted in R and Python to assess differences in GrimAge2 age acceleration between diagnostic groups while adjusting for age and sex. Plots for data exploration and presentation are generated with the Python libraries seaborn and matplotlib. The R packages minfi, BioAge, dnaMethyAge, and methylclock are used to prepare for epigenetic clock analysis. Finally, data wrangling is performed with R's data.table and Python's pandas to prepare, clean, and transform the raw data for analysis. Future research will compare multiple methylation aging clocks, characterize the individual contributions of GrimAge2 subcomponents, and explore the effects of lithium treatment and other environmental modifiers on epigenetic age acceleration in bipolar disorder.
High-Performance Computing (HPC): Conducted in the Hoffman2 HPC environment utilizing SGE job scheduling and parallel processing in R for computationally intensive tasks.
Data Management: Performed data cleaning, transformation, merging, and subsetting across both R and Python. Efficiently processed large datasets using packages including bigmemory and pyarrow. Made analysis workflows easier to reproduce by recording key data characteristics (e.g., data dimensions, timestamps) in output filenames.
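The filename convention described above could look roughly like the following sketch. The function name `stamped_filename`, the `rows x cols` layout, and the timestamp format are illustrative assumptions, not the naming scheme actually used in this repository.

```python
from datetime import datetime

def stamped_filename(prefix, n_rows, n_cols, ext="csv", when=None):
    """Build a filename that records data dimensions and a timestamp.

    Hypothetical helper: the exact layout used in the project is not documented.
    """
    when = when or datetime.now()
    stamp = when.strftime("%Y%m%d_%H%M%S")
    return f"{prefix}_{n_rows}x{n_cols}_{stamp}.{ext}"

# A fixed timestamp is passed here only to make the example deterministic.
example = stamped_filename(
    "grimage2_results", 2000, 12, when=datetime(2024, 1, 15, 9, 30, 0)
)
print(example)  # grimage2_results_2000x12_20240115_093000.csv
```

Encoding the dimensions in the name makes it easy to spot, from a directory listing alone, when a rerun produced a table with a different number of samples or columns.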
Statistical Analysis: Conducted various statistical analyses, including descriptive statistics, correlation analysis, t-tests, and ANCOVA; generalized additive models (GAMs) are planned.
Data Visualization: Created a wide range of static visualizations for exploratory data analysis and presentation of results.
Version Control: Utilized GitHub for code sharing and version control.
Workflow Design: Designed and implemented a multi-stage analysis pipeline involving data preprocessing, clock calculation, statistical analysis, visualization, and reporting, including integration of R and Python components.
Data Acquisition: Acquired raw DNA methylation data (likely IDAT files) from Illumina EPIC arrays along with accompanying sample sheets containing demographic and diagnostic information. Potentially integrated data from multiple sources (e.g., "Bipolar 2023 Sample Sheet", "2000_sample_covariates", "highcov_technical_covariates", "Complete BIG Data").
Data Import and Formatting: Imported data into R and converted to appropriate formats (e.g., GenomicRatioSet) for downstream analysis using minfi. Used R's read.csv, read_excel, and read.table for sample sheet information. Employed Python's pyarrow.feather for efficient loading of preprocessed and saved data subsets.
Data Cleaning and Quality Control (QC): Performed quality control procedures, including:
Checking for missing data in both methylation and sample annotation data.
Addressing missing probe information using external resources like the mepylome package and manifest files.
Removal of duplicate probe data.
Comparison of predicted and reported sex.
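As one illustration of the QC checks listed above, the sex comparison can be sketched as a simple mismatch flag. The field names (`sample_id`, `reported_sex`, `predicted_sex`) and the normalization to a single letter are assumptions made for this example; the project's actual sample sheet columns may differ.

```python
def flag_sex_mismatches(samples):
    """Return IDs of samples whose predicted and reported sex disagree.

    Values are normalized to their first letter ("Female" -> "F") before
    comparison; samples with an empty value on either side are skipped.
    """
    mismatches = []
    for s in samples:
        reported = s["reported_sex"].strip().upper()[:1]
        predicted = s["predicted_sex"].strip().upper()[:1]
        if reported and predicted and reported != predicted:
            mismatches.append(s["sample_id"])
    return mismatches

# Made-up records, for illustration only.
samples = [
    {"sample_id": "S1", "reported_sex": "Female", "predicted_sex": "F"},
    {"sample_id": "S2", "reported_sex": "M", "predicted_sex": "F"},
    {"sample_id": "S3", "reported_sex": "male", "predicted_sex": "M"},
]
mismatched = flag_sex_mismatches(samples)
print(mismatched)  # ['S2']
```

Flagged samples are candidates for manual review, since a mismatch can indicate a sample swap or a data entry error rather than a problem with the array itself.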
Data Wrangling and Transformation: Manipulated and transformed data using dplyr, tidyr, and data.table in R and pandas in Python. This included renaming columns, recoding variables (e.g., Gender), handling "_REP" sample duplicates, merging datasets, calculating age in months and years from date data, and summarizing missing data patterns.
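Two of the wrangling steps above, handling "_REP" duplicates and deriving age from dates, are sketched below in plain Python. How replicates were actually resolved is not stated here, so keeping the first occurrence of each base sample ID is an assumption, as are the helper names.

```python
from datetime import date

def base_sample_id(sample_id):
    """Strip a trailing "_REP" marker so a replicate maps to its original ID."""
    return sample_id[:-4] if sample_id.endswith("_REP") else sample_id

def drop_replicates(sample_ids):
    """Keep the first occurrence of each base sample ID, preserving order.

    Assumption: the first record seen is the one to keep.
    """
    seen, kept = set(), []
    for sid in sample_ids:
        base = base_sample_id(sid)
        if base not in seen:
            seen.add(base)
            kept.append(sid)
    return kept

def age_in_months(birth, collected):
    """Whole months elapsed between a birth date and a collection date."""
    months = (collected.year - birth.year) * 12 + (collected.month - birth.month)
    if collected.day < birth.day:
        months -= 1  # the current month is not yet complete
    return months

kept = drop_replicates(["A01", "A02", "A01_REP", "A03_REP"])
months = age_in_months(date(1990, 6, 15), date(2023, 3, 10))
years = months / 12
print(kept)    # ['A01', 'A02', 'A03_REP']
print(months)  # 392
```

A replicate with no matching original (here `A03_REP`) is retained, so no sample is lost just because its primary record is missing.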
Data Subsetting: Created subsets of data for specific analyses (e.g., selecting samples with complete data, extracting specific CpG sites related to GrimAge2).
GrimAge2 Calculation: Calculated GrimAge2 and AgeAccelGrim2 using custom R functions leveraging bigmemory for efficient handling of large matrices and doParallel for parallel processing of subcomponents. This included loading pre-trained GrimAge2 model weights and reference values.
Other Clock Calculations: Calculated various epigenetic clocks using R packages (DNAmAge, DunedinPoAm, DunedinPACE) and the Python package pyaging. This required handling missing CpG sites for each clock and managing compatibility between R and Python data structures.
Probe Analysis and Verification: Compared the CpG sites required by GrimAge2 with the available CpG sites in the methylation data and reference array annotations (IlluminaHumanMethylationEPICv2anno.20a1.hg38). Identified and documented missing probes.
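The probe comparison described above amounts to a set difference between the CpGs a clock requires and the CpGs present in the data. A minimal sketch follows; the probe IDs are placeholders and the function name is hypothetical.

```python
def compare_probes(required, available):
    """Return the sorted list of missing probes and the fraction covered."""
    required, available = set(required), set(available)
    missing = sorted(required - available)
    coverage = 1 - len(missing) / len(required) if required else 1.0
    return missing, coverage

# Placeholder IDs, not real GrimAge2 probes.
required_cpgs = ["cg_demo_0001", "cg_demo_0002", "cg_demo_0003", "cg_demo_0004"]
available_cpgs = ["cg_demo_0001", "cg_demo_0003", "cg_demo_0999"]

missing, coverage = compare_probes(required_cpgs, available_cpgs)
print(missing)   # ['cg_demo_0002', 'cg_demo_0004']
print(coverage)  # 0.5
```

Running the same comparison against both the measured data and the array annotation helps separate probes that are absent from the array design from probes that were dropped during preprocessing.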
Descriptive Statistics: Computed descriptive statistics (e.g., mean, standard deviation, median, quartiles) for age, GrimAge2, and AgeAccelGrim2, stratified by diagnosis, using data.table and pandas.
Correlation Analysis: Calculated Pearson, Spearman, and Kendall correlations between chronological age and GrimAge2 using R's stats package.
Comparative Analysis: Performed t-tests and ANCOVA to compare AgeAccelGrim2 between bipolar and control groups, with age as a covariate, using R's stats package and Python's statsmodels package.
Data Visualization: Generated various plots, including density plots, box plots, violin plots, scatter plots, bar plots, and pie charts, to visualize data distributions, correlations, and group differences using ggplot2 and plotly in R and seaborn and matplotlib in Python. This involved customizing plot aesthetics, adding statistical annotations (p-values, effect sizes), and creating multi-panel figures.
Data Export and Reporting: Exported results and summary tables to CSV and Excel files for reporting and sharing, using data.table's fwrite in R and pandas' DataFrame.to_csv in Python for the CSV output.