Output files
This page is dedicated to output files created by ODAMNet, grouped by:
Query result files
Overlap analysis result files
Active module identification (AMI) result files
Random walk with restart (RWR) result files
Query output files
By default, ODAMNet retrieves its input data from databases using queries. Chemical target genes are retrieved from the Comparative Toxicogenomics Database (CTD) [1]. Rare disease pathways are retrieved from WikiPathways [2]. Biological networks can also be downloaded from the Network Data Exchange (NDEx) [3].
These queries create output files that contain their results.
CTD query output files
Chemical target genes are retrieved from CTD. Two files are created:
This file contains the raw query results.
This file contains the filtered query results. You can filter the raw query results according to the number of
publications that support a chemical-gene association (--nbPub parameter). This file is used for the
analysis.
These two files have the same format:
Input: chemical query name (from the chemicals file)
ChemicalName: name of the query chemical or of one of its descendant chemicals
ChemicalId: MeSH ID of the query chemical or of one of its descendant chemicals
CasRN: CAS Registry Number of the query chemical or of one of its descendant chemicals
GeneSymbol: symbol of the target gene connected to the query chemical or to one of its descendant chemicals
GeneId: target gene ID (HGNC)
Organism: name of the organism the target gene comes from
OrganismId: organism ID
PubMedIds: PubMed IDs of the publications associated with this connection (separated by |)
This is an example of this file:
Input ChemicalName ChemicalId CasRN GeneSymbol GeneId Organism OrganismId PubMedIds
d014801 Tretinoin D014212 302-79-4 ZXDC 79364 Homo sapiens 9606 33167477
d014801 Tretinoin D014212 302-79-4 ZYG11A 440590 Homo sapiens 9606 23724009|33167477
d014801 Tretinoin D014212 302-79-4 ZYX 7791 Homo sapiens 9606 23724009
d014801 Tretinoin D014212 302-79-4 ZZZ3 26009 Homo sapiens 9606 33167477
d014801 Vitamin A D014801 11103-57-4 ACE2 59272 Homo sapiens 9606 32808185
d014801 Vitamin A D014801 11103-57-4 AKR1B1 231 Homo sapiens 9606 19014918
d014801 Vitamin A D014801 11103-57-4 AKR1B10 57016 Homo sapiens 9606 19014918
The file name is composed of MeSHID and DATE, which correspond to the query chemical and the query date (yyyy_mm_dd), respectively.
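The filtering on the number of publications can be illustrated with a minimal sketch. It assumes that the file is tab-separated, that PubMed IDs are separated by | (as in the example), and that an association is kept when it has at least the requested number of publications. The function name filter_by_publications is hypothetical and is not part of ODAMNet.

```python
import csv
import io

# Rows copied from the example above; a real file would be opened with open(path).
SAMPLE = (
    "Input\tChemicalName\tChemicalId\tCasRN\tGeneSymbol\tGeneId\tOrganism\tOrganismId\tPubMedIds\n"
    "d014801\tTretinoin\tD014212\t302-79-4\tZXDC\t79364\tHomo sapiens\t9606\t33167477\n"
    "d014801\tTretinoin\tD014212\t302-79-4\tZYG11A\t440590\tHomo sapiens\t9606\t23724009|33167477\n"
    "d014801\tVitamin A\tD014801\t11103-57-4\tACE2\t59272\tHomo sapiens\t9606\t32808185\n"
)


def filter_by_publications(handle, min_pubs):
    """Keep rows supported by at least `min_pubs` PubMed IDs (assumed semantics)."""
    reader = csv.DictReader(handle, delimiter="\t")
    kept = []
    for row in reader:
        # Ignore empty fragments in case the field is blank.
        pubmed_ids = [p for p in row["PubMedIds"].split("|") if p]
        if len(pubmed_ids) >= min_pubs:
            kept.append(row)
    return kept


filtered = filter_by_publications(io.StringIO(SAMPLE), min_pubs=2)
target_genes = [row["GeneSymbol"] for row in filtered]
print(target_genes)
```

With a threshold of 2, only associations reported in two or more publications remain in this sample.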
WikiPathways query output files
Rare disease pathways are retrieved from WikiPathways. Two GMT files are created:
This file contains the rare disease pathways in GMT format.
This file contains the human disease pathways in GMT format.
A GMT file is a tab-separated file:
pathwayIDs: the first column is the WikiPathways ID
pathways: the second column is the name of the pathway
HGNC: all the other columns contain the genes of the pathway. The number of columns differs between pathways, depending on the number of genes each pathway contains.
This is an example of this file:
pathwayIDs pathways HGNC
WP5195 Disorders in ketolysis ACAT1 HMGCS1 OXCT1 BDH1 ACAT2
WP5189 Copper metabolism ATP7B ATP7A SLC11A2 SLC31A1
WP5190 Creatine pathway GAMT SLC6A8 GATM OAT CK
The file name contains the query DATE (yyyy_mm_dd).
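Because each row has a different number of columns, a GMT file is easier to read line by line than with a fixed-width table reader. The sketch below is one possible way to load it into a dictionary. It assumes a tab-separated file with a header line, as in the example; read_gmt is a hypothetical helper, not an ODAMNet function.

```python
import io

# Rows copied from the example above, written with explicit tab separators.
SAMPLE_GMT = (
    "pathwayIDs\tpathways\tHGNC\n"
    "WP5195\tDisorders in ketolysis\tACAT1\tHMGCS1\tOXCT1\tBDH1\tACAT2\n"
    "WP5189\tCopper metabolism\tATP7B\tATP7A\tSLC11A2\tSLC31A1\n"
)


def read_gmt(handle, has_header=True):
    """Return {pathway ID: (pathway name, [genes])} from a GMT-like file."""
    pathways = {}
    for index, line in enumerate(handle):
        if has_header and index == 0:
            continue  # skip the column names
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        pathway_id, name, *genes = fields
        pathways[pathway_id] = (name, [gene for gene in genes if gene])
    return pathways


pathways = read_gmt(io.StringIO(SAMPLE_GMT))
for pathway_id, (name, genes) in pathways.items():
    print(pathway_id, name, len(genes))
```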
NDEx query output files
There are two ways to download a biological network from NDEx with ODAMNet.
The first one applies to the active module identification (AMI) approach. You can use a biological network
downloaded directly from NDEx by providing its identifier with the --netUUID parameter. A SIF file is then
created. See Active Module Identification page for more details.
The second one uses the networkDownloading function. By providing the --netUUID parameter, you can download
biological networks in both SIF and GR formats. See Network downloading page
for more details.
This is an example of a network in GR format (two columns):
MMP11 PRPF40A
ASB16-AS1 SHBG
KIAA0513 INTS4
KIAA0513 HAX1
RAVER2 PTBP1
This is an example of a network in SIF format (three columns):
node_1 link node_2
MMP11 interacts with PRPF40A
ASB16-AS1 interacts with SHBG
KIAA0513 interacts with INTS4
KIAA0513 interacts with HAX1
RAVER2 interacts with PTBP1
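The two network layouts carry the same edges: the three-column layout (node_1, link, node_2) adds an interaction type between the two nodes, while the two-column layout lists only the node pairs. The sketch below converts the first into the second. It assumes tab-separated columns (the link value "interacts with" contains a space) and a header line; sif_to_gr is a hypothetical helper, not the ODAMNet implementation.

```python
import io

# Rows copied from the three-column example above, with explicit tab separators.
SAMPLE_SIF = (
    "node_1\tlink\tnode_2\n"
    "MMP11\tinteracts with\tPRPF40A\n"
    "ASB16-AS1\tinteracts with\tSHBG\n"
    "KIAA0513\tinteracts with\tINTS4\n"
)


def sif_to_gr(handle, has_header=True):
    """Return the (source, target) pairs of a three-column network file."""
    edges = []
    for index, line in enumerate(handle):
        if has_header and index == 0:
            continue  # skip the column names
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        source, _link, target = fields
        edges.append((source, target))
    return edges


edges = sif_to_gr(io.StringIO(SAMPLE_SIF))
gr_text = "\n".join(f"{source}\t{target}" for source, target in edges)
print(gr_text)
```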
Overlap analysis output files
In the Overlap analysis, only one type of file is created: Overlap_*.csv. The number of result files depends on the
number of chemicals given in the chemicals file.
This file contains ten columns:
PathwayIDs: pathway ID
PathwayNames: pathway name
PathwayBackgroundNames: source of the pathway (e.g. WikiPathways)
PathwaySizes: number of genes inside the pathway
TargetSize: number of target genes (i.e. genes that interact with the chemical) that are in the background gene set
IntersectionSize: number of target genes that are inside the pathway
BackgroundSizes: number of genes in the background gene set (e.g. genes from all human pathways in WikiPathways)
pValue: p-value of the overlap between target genes and pathways/processes of interest (hypergeometric test)
pAdjusted: adjusted p-value (multiple testing correction)
Intersection: list of genes shared between target genes and pathways/processes of interest (space-separated)
This is an example of this file:
PathwayIDs;PathwayNames;PathwayBackgroundNames;PathwaySizes;TargetSize;IntersectionSize;BackgroundSizes;pValue;pAdjusted;Intersection
WP4940;15q11.2 copy number variation syndrome;WikiPathway_2022_08_01;10;1721;0;12379;1.0;1.0;
WP4271;Vitamin B12 disorders;WikiPathway_2022_08_01;13;1721;0;12379;1.0;1.0;
WP4299;Lamin A-processing pathway;WikiPathway_2022_08_01;3;1721;0;12379;1.0;1.0;
WP4506;Tyrosine metabolism;WikiPathway_2022_08_01;4;1721;0;12379;1.0;1.0;
WP5223;2q21.1 copy number variation syndrome;WikiPathway_2022_08_01;42;1721;1;12379;0.9981605117974595;1.0;APC
WP4686;Leucine, isoleucine and valine metabolism;WikiPathway_2022_08_01;24;1721;2;12379;0.8660465002997586;1.0;BCAT1 BCAT2
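The size columns are the inputs of the hypergeometric test, so a pValue can be recomputed by hand. The sketch below assumes that the p-value is the upper tail of the distribution, i.e. the probability of observing at least IntersectionSize target genes in a pathway of PathwaySizes genes. This assumption is not stated on this page, but it gives about 0.998 for the WP5223 row and about 0.866 for the WP4686 row of the example. The function name is hypothetical.

```python
from math import comb


def hypergeom_upper_tail(background, pathway_size, target_size, intersection):
    """P(X >= intersection) when `pathway_size` genes are drawn from
    `background` genes, `target_size` of which are target genes."""
    total = comb(background, pathway_size)
    upper = min(pathway_size, target_size)
    favourable = sum(
        comb(target_size, k) * comb(background - target_size, pathway_size - k)
        for k in range(intersection, upper + 1)
    )
    # Exact integer counts; only the final ratio is converted to a float.
    return favourable / total


# Values taken from the WP5223 and WP4686 rows of the example above.
p_wp5223 = hypergeom_upper_tail(12379, 42, 1721, 1)
p_wp4686 = hypergeom_upper_tail(12379, 24, 1721, 2)
print(p_wp5223, p_wp4686)
```

An intersection of 0 always gives a p-value of 1.0, which matches the first rows of the example.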
See Overlap analysis page, Use-case 1 overlap analysis and Use-case 2 overlap analysis for more details.
AMI output files
The DOMINO_inputGeneList_*.txt file contains the input list of target genes used by DOMINO [4].
CCND1
CDKN1A
BAD
ESR1
KRT18
The following three files contain the results of the AMI analysis. They give information about the identified active modules.
This file lists the edges of each identified active module. It contains four columns:
source: node 1
target: node 2
link: kind of link
AMI_number: active module number
This is an example of the file:
source target link AMI_number
CDT1 MCM6 ppi 1
CDT1 CDK1 ppi 1
CDT1 ORC1 ppi 1
CDT1 MCM2 ppi 1
CDT1 GMNN ppi 1
Some metrics are calculated for each identified active module.
AMINumber: active module number
EdgesNumber: number of edges in the active module
NodesNumber: number of nodes in the active module
ActiveGenesNumber: number of target genes
This is an example of the file:
AMINumber EdgesNumber NodesNumber ActiveGenesNumber
1 357 93 35
2 246 69 27
3 135 66 26
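These metrics can be derived from the edge list of the active modules and the list of target genes. The sketch below is one way to do it, assuming that edges are undirected and counted once per node pair. The function name and the target gene list are hypothetical and serve only as an illustration.

```python
def module_metrics(edges, target_genes):
    """Compute per-module metrics from (source, target, link, module number) rows."""
    targets = set(target_genes)
    modules = {}
    for source, target, _link, number in edges:
        module = modules.setdefault(number, {"edges": set(), "nodes": set()})
        # A frozenset makes A-B and B-A count as the same undirected edge.
        module["edges"].add(frozenset((source, target)))
        module["nodes"].update((source, target))
    return {
        number: {
            "EdgesNumber": len(module["edges"]),
            "NodesNumber": len(module["nodes"]),
            "ActiveGenesNumber": len(module["nodes"] & targets),
        }
        for number, module in modules.items()
    }


# Edges copied from the active module example above.
EDGES = [
    ("CDT1", "MCM6", "ppi", 1),
    ("CDT1", "CDK1", "ppi", 1),
    ("CDT1", "ORC1", "ppi", 1),
    ("CDT1", "MCM2", "ppi", 1),
    ("CDT1", "GMNN", "ppi", 1),
]
# Hypothetical target genes, for illustration only.
HYPOTHETICAL_TARGETS = ["CDT1", "GMNN", "ESR1"]

metrics = module_metrics(EDGES, HYPOTHETICAL_TARGETS)
print(metrics[1])
```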
This file is created to be imported into Cytoscape [5] for visualisation. It contains four columns:
geneSymbol: gene name
ActiveModule: active module number
activeGene: True if the gene is a target gene
overlapSignificant: True if the active module has significant overlap results
This is an example of the file:
geneSymbol ActiveModule activeGene overlapSignificant
NPAT 1 False False
CCNA1 1 True False
CDC6 1 True False
B3GALNT1 1 False False
USP26 1 False False
The following three files contain the results of the Overlap analysis between identified active modules and pathways/processes of interest.
There is one overlap file per identified active module. Each file contains the Overlap analysis results. See Overlap analysis output files for more details.
This file contains the significant overlap results between identified active modules and pathways/processes of interest. If a pathway/process overlaps significantly with several active modules, only its best p-value is kept.
It contains two columns: pathways/processes of interest and best adjusted p-value.
This is an example of this file:
WP5087 2.778369668213874e-25
WP4541 4.368084017694385e-07
WP4577 2.839118197421641e-06
WP5053 1.2298630252448874e-05
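Keeping the best adjusted p-value per pathway/process across active modules can be sketched as follows. The significance threshold of 0.05, the function name, the module labels and the non-significant 0.2 value are assumptions made for the illustration. The other values are taken from the examples on this page.

```python
def best_padj_per_term(results_by_module, alpha=0.05):
    """Keep, for each term, the lowest adjusted p-value among significant overlaps."""
    best = {}
    for results in results_by_module.values():
        for term, padj in results:
            if padj <= alpha and padj < best.get(term, float("inf")):
                best[term] = padj
    # Most significant terms first.
    return sorted(best.items(), key=lambda item: item[1])


# Hypothetical per-module overlap results (term ID, adjusted p-value).
RESULTS = {
    "module_a": [("WP5087", 4.823470963219471e-11), ("WP4541", 4.368084017694385e-07)],
    "module_b": [("WP5087", 2.778369668213874e-25), ("WP4577", 0.2)],
}

best = best_padj_per_term(RESULTS)
for term, padj in best:
    print(term, padj)
```

Here WP5087 is significant in both modules and only the lower value is kept, while WP4577 is dropped because it does not pass the assumed threshold.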
This file is created for the visualisation using Cytoscape [5]. It contains five columns:
geneSymbol: HGNC gene symbol
AM_number: active module number
termID: pathway/process ID (e.g. GO, WP, Reactome, etc.)
termTitle: pathway/process name
overlap_padj: adjusted p-value of the overlap
This is an example of this file:
geneSymbol AM_number termID termTitle overlap_padj
CEBPA 2 WP4879 Overlap between signal transduction pathways contributing to LMNA laminopathies 0.010978293424676187
CEBPB 2 WP4879 Overlap between signal transduction pathways contributing to LMNA laminopathies 0.010978293424676187
JUNB 2 WP4879 Overlap between signal transduction pathways contributing to LMNA laminopathies 0.010978293424676187
RUNX2 2 WP4879 Overlap between signal transduction pathways contributing to LMNA laminopathies 0.010978293424676187
CEBPA 2 WP4844 Influence of laminopathies on Wnt signaling 0.027997181221540435
CEBPB 2 WP4844 Influence of laminopathies on Wnt signaling 0.027997181221540435
RUNX2 2 WP4844 Influence of laminopathies on Wnt signaling 0.027997181221540435
CXCL5 6 WP5087 Malignant pleural mesothelioma 4.823470963219471e-11
FN1 6 WP5087 Malignant pleural mesothelioma 4.823470963219471e-11
See Active Module Identification page, Use-case 1 AMI analysis and Use-case 2 AMI analysis for more details.
RWR output files
In the RWR approach, the config_minimal.yml and seeds.txt input files are copied into the output directory
that contains the results. See Configuration file for more details.
The other created files contain RWR results.
There is one multiplex output file per multiplex used in the RWR analysis. Each file contains the RWR score of every node, in three columns:
multiplex: multiplex folder name
node: name of the node inside the multiplex (e.g. target genes, pathways, etc.)
score: score calculated by the walk
This is an example of this file:
multiplex node score
1 VCAM1 0.0002083975629882177
1 FN1 0.00020345404504599346
1 EGFR 0.00020244600248388192
1 HSP90AB1 0.00020195660880228006
1 CTNNB1 0.0002014264852242386
1 TP53 0.00019080205293178928
1 MED1 0.0001875608976608657
1 EP300 0.00018540571477254143
1 SMAD3 0.0001852022345355004
The name of this network file is the one you provide as input (--sifFileName). See Random Walk with Restart for more
details. This file is created for visualisation using Cytoscape [5]. This network file is in
SIF format and contains three columns:
source node: node name
link source: source of the link (which multiplex or bipartite)
target node: node name
This is an example of this file:
A8K1F4_HUMAN multiplex/1/PPI_Jan2021.gr TP53
A8K251_HUMAN multiplex/1/PPI_Jan2021.gr HSP90AB1
AAK1 multiplex/1/Reactome_Nov2020.gr EGFR
AARS multiplex/1/PPI_Jan2021.gr FN1
AARS multiplex/1/PPI_Jan2021.gr VCAM1
AATF multiplex/1/PPI_Jan2021.gr SMAD3
ABCE1 multiplex/1/PPI_Jan2021.gr VCAM1
ABCF1 multiplex/1/PPI_Jan2021.gr FN1
ABI1 multiplex/1/Reactome_Nov2020.gr MAPK1
ABL1 multiplex/1/PPI_Jan2021.gr EGFR
This file contains the top X pathways/processes of interest, ranked by their RWR score.
You can choose the top number using the --top parameter.
This is an example of this file:
node score
WP5087 0.002847885875091137
WP4673 0.0009022865859048019
WP2059 0.0007759015708361376
WP5124 0.0007759015708361376
WP4298 0.0007690455140750499
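Selecting the top pathways amounts to sorting the nodes by decreasing score and keeping the first entries. The sketch below shows the idea on the values of the example, given in arbitrary order. How ODAMNet orders nodes with equal scores is not documented here: this sketch simply keeps their input order. The function name top_nodes is hypothetical.

```python
def top_nodes(scores, top):
    """Return the `top` (node, score) pairs with the highest scores."""
    # sorted() is stable, so nodes with equal scores keep their input order.
    ranked = sorted(scores, key=lambda item: item[1], reverse=True)
    return ranked[:top]


# Scores copied from the example above, deliberately shuffled.
SCORES = [
    ("WP4298", 0.0007690455140750499),
    ("WP2059", 0.0007759015708361376),
    ("WP5087", 0.002847885875091137),
    ("WP5124", 0.0007759015708361376),
    ("WP4673", 0.0009022865859048019),
]

top3 = top_nodes(SCORES, top=3)
for node, score in top3:
    print(node, score)
```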
See Random Walk with Restart page, Use-case 1 RWR analysis and Use-case 2 RWR analysis for more details.