ASTROREPOMICS

A Space-Integrated Transcriptomic Database for Reproductive Studies

Welcome to Astrorepomics: Space-Relevant Reproductive Transcriptomics


Astrorepomics is a specialized expression data analysis platform tailored specifically for investigating gene expression related to reproductive biology under space-relevant conditions, such as microgravity and radiation exposure.

This platform provides intuitive tools for performing differential expression analysis, cross-species meta-analyses, and creating insightful visualizations including interactive volcano plots, heatmaps, and correlation matrices.

For accurate results, follow the workflow below in order:

  1. Load, inspect, and validate your metadata thoroughly in the 'Metadata' tab.
  2. Generate and review your normalized expression matrix in the 'Expression Matrix' tab.
  3. Perform robust differential expression analysis using the tools provided in the 'DEG' tab.
  4. Explore and interpret key differential expression visualizations (interactive volcano plots, cross-species overlap analyses, and hierarchical heatmaps) in the 'Differential Expression Insights' tab.
  5. Run pathway and gene set enrichment analyses in the 'Pathway Enrichment' tab. Results are available across multiple databases, including KEGG, Panther, WikiPathways, Reactome, and MSigDB Hallmark.

For detailed assistance, interactive tutorials are available within each tab to guide you through every step of the analysis, from initial data input to final visualization.


Important Note: Ensure your data are properly merged, normalized, and validated before performing downstream analyses to guarantee meaningful biological interpretations.

For interoperability and further downstream analysis, results can be exported for use with resources such as KEGG and MSigDB for pathway enrichment and gene set exploration.

Step 1: Upload and Inspect Your Metadata

Begin your analysis by reviewing the metadata linked to your studies. This includes information about the experimental design, species, and conditions (e.g., microgravity, radiation). Ensuring that your metadata is accurate and complete is essential for meaningful downstream analyses.
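As a rough illustration of the kind of checks this step implies, the sketch below validates a small metadata table offline. The column names (`sample_id`, `study`, `species`, `condition`) and the function `validate_metadata` are assumptions made for this example, not the platform's actual schema or code.

```python
# Hypothetical required columns; the platform's real schema may differ.
REQUIRED = ("sample_id", "study", "species", "condition")

def validate_metadata(rows):
    """Return a list of human-readable problems found in the metadata rows."""
    problems = []
    seen_ids = set()
    for i, row in enumerate(rows, start=1):
        # Flag any required field that is absent or blank.
        for col in REQUIRED:
            value = row.get(col)
            if value is None or not str(value).strip():
                problems.append(f"row {i}: missing '{col}'")
        # Flag sample IDs that appear more than once.
        sample_id = row.get("sample_id")
        if sample_id:
            if sample_id in seen_ids:
                problems.append(f"row {i}: duplicate sample_id '{sample_id}'")
            seen_ids.add(sample_id)
    return problems

rows = [
    {"sample_id": "S1", "study": "A", "species": "Mus musculus", "condition": "microgravity"},
    {"sample_id": "S2", "study": "A", "species": "Mus musculus", "condition": ""},
    {"sample_id": "S1", "study": "B", "species": "Homo sapiens", "condition": "radiation"},
]
problems = validate_metadata(rows)
for problem in problems:
    print(problem)
```

An empty list means the rows passed these particular checks; it does not confirm that the experimental design itself is recorded correctly.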

Download Data

Literature

Below is a summary of relevant studies included in the database. You can click on a dataset name to view its source link. Use the checkboxes in the first column to select one or more studies. Selected studies determine which samples will be displayed in the table below.


Included Samples

This table lists individual samples associated with the selected studies above, including exposure conditions and tissue or cell type. Use the checkboxes to choose which specific samples to include in the downstream merging step.

Step 2: Generate and Refine Your Expression Matrix

This section lets you (i) merge raw expression tables from multiple studies, (ii) optionally correct technical batch effects, and (iii) download a ready-to-analyse matrix. All steps are interactive and can be repeated until you are satisfied with the result.

Step 2.1: Merge Raw Expression Tables

Press “Start Merging” to pull the selected samples from the database and unite them by your chosen gene identifier. No scaling or batch correction is applied at this stage.
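Conceptually, merging by gene identifier can be pictured as the sketch below. It assumes an inner join (only genes present in every selected study are kept) and a nested-dictionary layout; both are illustrative choices, and the function `merge_by_gene` is hypothetical rather than the platform's implementation.

```python
def merge_by_gene(tables):
    """Merge {study: {gene_id: {sample: value}}} into one gene-by-sample matrix.

    Assumption: only genes present in every study are kept (inner join).
    """
    shared = set.intersection(*(set(table) for table in tables.values()))
    merged = {}
    for gene in sorted(shared):
        merged[gene] = {}
        for study, table in tables.items():
            for sample, value in table[gene].items():
                # Prefix with the study name so sample columns stay unique.
                merged[gene][f"{study}:{sample}"] = value
    return merged

tables = {
    "studyA": {"Gene1": {"s1": 10.0, "s2": 12.0}, "Gene2": {"s1": 5.0, "s2": 7.0}},
    "studyB": {"Gene1": {"s1": 8.0}, "Gene3": {"s1": 1.0}},
}
merged = merge_by_gene(tables)
print(merged)
```

Values are copied unchanged, which matches the note that no scaling or batch correction happens at this stage.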


Step 2.2: Optional Batch-Effect Correction

Why? When your analysis combines data from multiple studies (for example, different sequencing platforms, library preparations, or space-flight missions), technical variation (known as a ‘batch effect’) can overshadow true biological signals. This step allows you to test and apply statistical correction methods to reduce such noise. Note: Batch-effect correction is only applicable when multiple datasets are present.


PCA · Before Correction

Samples are coloured by the first selected batch variable (or by the auto-detected one).

PCA · After Best Correction

The highest-ranked correction (based on quantitative scores) is displayed here.


How Did Each Method Perform?

The table ranks every ‘method × batch variable’ combination using: BatchQC skew/kurtosis, DSC variance ratio, log2-fold-change correlation, and DEG preservation. A higher composite score indicates better removal of batch effects while retaining biological signal.


Per-Gene Summary Statistics (pre-correction)

Mean and standard deviation of each gene within every dataset. These statistics feed the z-score option if you choose it as a correction method.
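To show how per-dataset means and standard deviations could feed a z-score correction, here is a minimal sketch for a single gene. It assumes the sample standard deviation (n - 1) and maps zero-variance genes to 0; the platform may make different choices, and `zscore_within_dataset` is a hypothetical name.

```python
from statistics import mean, stdev

def zscore_within_dataset(values):
    """Centre and scale one gene's values within a single dataset."""
    mu = mean(values)
    sd = stdev(values)  # sample standard deviation (n - 1); an assumption
    if sd == 0:
        # A gene with no variation carries no signal after scaling.
        return [0.0 for _ in values]
    return [(v - mu) / sd for v in values]

# One gene measured in two datasets on very different scales.
gene_by_dataset = {
    "studyA": [10.0, 12.0, 14.0],
    "studyB": [100.0, 120.0, 140.0],
}
corrected = {name: zscore_within_dataset(vals) for name, vals in gene_by_dataset.items()}
print(corrected)
```

After scaling, both datasets sit on the same unitless scale, which is why the two example studies end up with identical values despite a tenfold difference in raw magnitude.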


The matrix below reflects the last action you performed: either the raw merge or the best batch-corrected result. Review gene IDs and sample columns before moving to Step 3 (DE analysis).

Step 3: Differential Expression Analysis

This tab calculates log2 fold changes (Log2FC) and p-values across your selected conditions. Adjust the parameters (e.g., pseudocount, test type) to tailor the analysis for your study design.
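The role of the pseudocount can be seen in a small sketch. It assumes the pseudocount is added to each group mean before taking the ratio; the platform may instead add it to individual values, so treat `log2_fold_change` as a hypothetical helper rather than the tab's exact formula.

```python
from math import log2
from statistics import mean

def log2_fold_change(treated, control, pseudocount=1.0):
    """log2 ratio of group means, with a pseudocount to avoid division by zero."""
    return log2((mean(treated) + pseudocount) / (mean(control) + pseudocount))

# (7 + 1) / (1 + 1) = 4, so the log2 fold change is 2.
lfc = log2_fold_change([7.0, 7.0], [1.0, 1.0])
# A control mean of zero is still well defined thanks to the pseudocount.
lfc_zero_control = log2_fold_change([3.0, 3.0], [0.0, 0.0])
print(lfc, lfc_zero_control)
```

A larger pseudocount shrinks fold changes for lowly expressed genes more than for highly expressed ones, which is the main trade-off when adjusting it.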


Once differential expression analysis is completed, the table below will display computed Log2FC and p-values for each gene. Use these results to identify significantly altered genes under your specified conditions. If no results are showing, please ensure you've run the analysis using the button above.

Step 4: Integrative Visualization of Differential Gene Expression Across Species and Conditions

Cross-species comparison is only available when more than one species is detected in the merged data.

Visualize Differential Expression via Volcano Plot

A volcano plot helps you quickly spot genes that are significantly up- or down-regulated. Points far to the left or right represent larger fold changes, while points higher on the plot indicate stronger statistical significance (e.g., lower p-values).
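The coordinates behind each point can be sketched as follows. The cutoffs (absolute log2 fold change of at least 1, p-value below 0.05) and the function `volcano_point` are illustrative assumptions, not the platform's defaults.

```python
from math import log10

def volcano_point(log2fc, p_value, lfc_cutoff=1.0, p_cutoff=0.05):
    """Return (x, y, label) for one gene on a volcano plot.

    x is the log2 fold change and y is -log10(p); p_value must be > 0.
    """
    y = -log10(p_value)
    if p_value < p_cutoff and log2fc >= lfc_cutoff:
        label = "up"
    elif p_value < p_cutoff and log2fc <= -lfc_cutoff:
        label = "down"
    else:
        label = "ns"  # not significant under these cutoffs
    return log2fc, y, label

up = volcano_point(2.0, 0.001)
down = volcano_point(-1.5, 0.01)
ns = volcano_point(3.0, 0.2)  # large change, but weak statistical support
print(up, down, ns)
```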

After running differential expression analysis, this section will display a summary table followed by an interactive volcano plot. Hover over points in the plot to explore gene-level information. If nothing appears, please complete the prior steps to generate results.

Hierarchical Clustering and Heatmap Visualization

Heatmaps allow for quick pattern detection across multiple samples and genes. Hierarchical clustering can reveal groups of samples or genes with similar expression profiles.
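For intuition about how hierarchical clustering groups similar profiles, the sketch below runs average-linkage agglomerative clustering with a distance of 1 minus the Pearson correlation. Both the distance and the linkage are assumptions chosen for illustration; the platform's heatmap may use other settings, and constant profiles (zero variance) are not handled here.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length profiles."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def cluster(profiles):
    """Return the sequence of merges, closest pair of clusters first."""
    clusters = [(name,) for name in profiles]
    merges = []

    def linkage(pair):
        # Average pairwise distance between the members of two clusters.
        c1, c2 = pair
        total = sum(1 - pearson(profiles[a], profiles[b]) for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=linkage)
        clusters.remove(c1)
        clusters.remove(c2)
        clusters.append(c1 + c2)
        merges.append((c1, c2))
    return merges

profiles = {"g1": [1.0, 2.0, 3.0], "g2": [2.0, 4.0, 6.5], "g3": [3.0, 2.0, 1.0]}
merges = cluster(profiles)
print(merges)
```

Here `g1` and `g2` rise together and are merged first, while the opposite-trending `g3` joins last; a dendrogram beside a heatmap encodes exactly this merge order.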

Step 5: Pathway Enrichment

This module supports two enrichment methods using WebGestaltR: Gene Set Enrichment Analysis (GSEA) and Over-Representation Analysis (ORA). In GSEA mode, genes are ranked by their log2 fold change values to identify enriched gene sets. In ORA mode, genes are filtered based on fold change and p-value thresholds to identify over-represented pathways.
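The difference between the two input styles can be sketched in plain Python. This is not WebGestaltR code: the function names, the thresholds, and the toy results are assumptions, and the hypergeometric upper tail is shown only as the test commonly used to score over-representation.

```python
from math import comb

def rank_for_gsea(results):
    """Order genes by log2 fold change, highest first (a GSEA-style ranked list)."""
    return sorted(results, key=lambda gene: results[gene]["log2fc"], reverse=True)

def select_for_ora(results, lfc_cutoff=1.0, p_cutoff=0.05):
    """Keep genes passing both the fold-change and p-value thresholds."""
    return {gene for gene, r in results.items()
            if abs(r["log2fc"]) >= lfc_cutoff and r["p"] < p_cutoff}

def ora_p_value(universe_size, set_size, hits_size, overlap):
    """Hypergeometric upper tail: chance of seeing `overlap` or more shared genes."""
    upper = min(set_size, hits_size)
    favourable = sum(
        comb(set_size, k) * comb(universe_size - set_size, hits_size - k)
        for k in range(overlap, upper + 1)
    )
    return favourable / comb(universe_size, hits_size)

results = {
    "A": {"log2fc": 2.5, "p": 0.001},
    "B": {"log2fc": -1.8, "p": 0.01},
    "C": {"log2fc": 0.2, "p": 0.6},
    "D": {"log2fc": 1.2, "p": 0.2},
}
ranked = rank_for_gsea(results)    # every gene is kept, only the order matters
selected = select_for_ora(results) # only genes passing both cutoffs are kept
# 10 genes in the universe, a pathway of 3, and both of 2 selected genes fall in it.
p = ora_p_value(universe_size=10, set_size=3, hits_size=2, overlap=2)
print(ranked, selected, p)
```

GSEA therefore uses the whole ranked list, whereas ORA discards every gene that misses the thresholds before testing.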



Advanced Sensitivity Settings

Modify these parameters to adjust the sensitivity and stringency of pathway enrichment analyses. Typically, a minimum gene set size of 10–15 and a maximum of 500–1000 are recommended. Lower minNum and higher perNum increase detection sensitivity but may also introduce more noise.
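The effect of the size bounds can be pictured with a short sketch. It assumes that a gene set's size is counted as its overlap with the genes in your data; `filter_gene_sets`, `min_num`, and `max_num` are hypothetical names that mirror the idea rather than WebGestaltR's internals, and the tiny bounds are used only to keep the example readable.

```python
def filter_gene_sets(gene_sets, expressed, min_num=10, max_num=500):
    """Keep gene sets whose overlap with the expressed genes is within bounds."""
    kept = {}
    for name, genes in gene_sets.items():
        overlap = genes & expressed
        if min_num <= len(overlap) <= max_num:
            kept[name] = overlap
    return kept

expressed = {"g1", "g2", "g3", "g4", "g5"}
gene_sets = {
    "small": {"g1"},
    "medium": {"g1", "g2", "g3"},
    "large": {"g1", "g2", "g3", "g4", "g5"},
}
kept = filter_gene_sets(gene_sets, expressed, min_num=2, max_num=3)
print(kept)
```

Very small sets tend to give unstable statistics and very large sets are often too broad to interpret, which is why both ends are bounded.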



Download Full WebGestaltR Report

Download Full Report (ZIP)

Pre-Analysis Summary (ID Mapping + GO Slim)



How to interpret the mapping table:

This table shows how your submitted gene IDs were mapped to Entrez IDs and gene symbols. Only successfully mapped genes are retained for the enrichment analysis. The hyperlinks allow quick lookup in external NCBI gene databases.



How to interpret the GO Slim summary:

This barplot summarizes how your genes distribute across high-level Gene Ontology (GO Slim) terms. It gives a broad biological context before detailed pathway enrichment. Use the dropdown to explore Biological Process (BP), Molecular Function (MF), or Cellular Component (CC).


Enrichment Results