Code functions in detail

Method functions

@author: Morgane Térézol and Ozan Ozisik.

Methods can be applied to CTD and WP lists. Adapted from overlapAnalysis.py from Ozisik et al., 2021.

odamnet.methods_functions.overlap(targetGeneSet, pathOfInterestGenesDict, pathOfInterestNamesDict, pathwaysOfInterestList, backgroundGenesDict, featureName, outputPath, analysisName)

Calculate overlap between target genes and pathways of interest

Metrics:
  • M is the population size (number of genes in WikiPathways for Homo sapiens pathways)

  • n is the number of successes in the population (number of genes in the selected RD WP)

  • N is the sample size (number of genes shared between the target list (from the chemical) and the background genes from WP)

  • x is the number of drawn “successes” (number of genes shared between the target list and the RD WP)

Parameters:
  • targetGeneSet (set) – Set of HGNC targets

  • pathOfInterestGenesDict (dict) – Dictionary of pathways of interest

  • pathOfInterestNamesDict (dict) – Dictionary of the titles of the pathways of interest

  • pathwaysOfInterestList (list) – Pathways of interest list and their associated background name

  • backgroundGenesDict (dict) – Dict of background genes

  • featureName (str) – Feature name (e.g. MeSH ID or chemical name)

  • outputPath (str) – Folder path to save the results

  • analysisName (str) – Analysis name for the output name file
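
The metrics M, n, N and x listed for overlap follow the usual parameterisation of a hypergeometric test. The following is a minimal sketch, assuming the significance of the overlap is assessed with a one-sided hypergeometric p-value P(X ≥ x). The helper name hypergeom_pvalue and the toy gene symbols are hypothetical and are not part of odamnet.

```python
from math import comb


def hypergeom_pvalue(x, M, n, N):
    """Return P(X >= x) when N genes are drawn from M, of which n are successes."""
    total = comb(M, N)
    upper = min(n, N)
    # math.comb returns 0 when k exceeds the pool size, so impossible terms vanish.
    tail = sum(comb(n, k) * comb(M - n, N - k) for k in range(max(x, 0), upper + 1))
    return tail / total


# Toy data (hypothetical gene symbols, not real pathway content).
background_genes = {"A", "B", "C", "D", "E", "F", "G", "H", "I", "J"}
pathway_genes = {"A", "B", "C", "D"}
target_genes = {"A", "B", "E", "Z"}  # "Z" is absent from the background

M = len(background_genes)                 # population size
n = len(pathway_genes)                    # successes in the population
N = len(target_genes & background_genes)  # sample size
x = len(target_genes & pathway_genes)     # drawn successes

p_value = hypergeom_pvalue(x, M, n, N)
print(M, n, N, x, round(p_value, 4))
```

With these toy sets, M = 10, n = 4, N = 3 and x = 2, which gives a p-value of 40/120 = 1/3.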

odamnet.methods_functions.overlapAnalysis(targetGenesDict, pathOfInterestGenesDict, pathOfInterestNamesDict, pathwaysOfInterestList, backgroundGenesDict, outputPath, analysisName)

Calculate overlap between target genes and pathways of interest.

Parameters:
  • targetGenesDict (dict) – Dict composed of interaction genes list for each chemical

  • pathOfInterestGenesDict (dict) – Dict of pathways of interest genes

  • pathOfInterestNamesDict (dict) – Dict of pathways of interest names

  • pathwaysOfInterestList (list) – Pathways of interest list and their associated background name

  • backgroundGenesDict (dict) – Dict of background genes

  • outputPath (str) – Folder path to save the results

  • analysisName (str) – Analysis name for the output name file

odamnet.methods_functions.RWR(configPath, networksPath, outputPath, sifPathName, top)

Perform a Random Walk with Restart analysis on multilayer networks. Seeds and networks must be specified.

Parameters:
  • configPath (str) – Configuration file name path

  • networksPath (str) – Networks path name

  • outputPath (str) – Output folder path name

  • sifPathName (str) – Result file name path to write SIF result file

  • top (int) – Number of top results to report
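
As a rough illustration of the idea behind Random Walk with Restart, here is a sketch on a small single-layer graph in pure Python. It is not the multilayer implementation that RWR relies on. The function name, the restart probability of 0.7 and the toy graph are all assumptions made for the example. The top argument mirrors the parameter of the same name by keeping only the best-scoring nodes.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.7, top=None,
                             tol=1e-12, max_iter=1000):
    """Score nodes by iterating p = restart * p0 + (1 - restart) * W p.

    Assumes every node has at least one neighbour (no dangling nodes).
    """
    nodes = sorted(adjacency)
    # The walker restarts uniformly over the seed nodes.
    start = {node: (1.0 / len(seeds) if node in seeds else 0.0) for node in nodes}
    scores = dict(start)
    for _ in range(max_iter):
        updated = {node: restart * start[node] for node in nodes}
        for node in nodes:
            neighbours = adjacency[node]
            # Each node spreads its remaining probability evenly to its neighbours.
            share = (1.0 - restart) * scores[node] / len(neighbours)
            for neighbour in neighbours:
                updated[neighbour] += share
        change = sum(abs(updated[node] - scores[node]) for node in nodes)
        scores = updated
        if change < tol:
            break
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top]


# Toy undirected path graph A - B - C - D with a single seed.
toy_graph = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
all_scores = random_walk_with_restart(toy_graph, seeds={"A"})
top_three = random_walk_with_restart(toy_graph, seeds={"A"}, top=3)
print([node for node, _ in top_three])
```

Nodes closer to the seed receive higher scores, so the ranking follows the distance from A.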

odamnet.methods_functions.DOMINO(genesFileName, networkFileName, outputPath, featureName)

Run active modules identification analysis on the DOMINO server

Parameters:
  • genesFileName – Active genes file name (e.g. list of genes of interest)

  • networkFileName – Network file name

  • outputPath – Output path name to save the results

  • featureName – Feature name (e.g. chemical name)

Returns:

  • activeModules_list (list) – list of active modules identified

odamnet.methods_functions.DOMINOandOverlapAnalysis(featuresDict, networkFileName, pathOfInterestGenesDict, pathOfInterestNamesDict, pathwaysOfInterestList, backgroundGenesDict, outputPath, analysisName)

Run an active module identification for each target genes list. Run an overlap analysis between the identified active modules and the pathways of interest.

Parameters:
  • featuresDict – Dict of list of genes

  • networkFileName – Content of network file

  • pathOfInterestGenesDict – Genes dict of pathways of interest

  • pathOfInterestNamesDict – Names dict of pathways of interest

  • pathwaysOfInterestList – List of pathways of interest and their associated background name

  • backgroundGenesDict – Dict of background genes

  • outputPath – Output path name to save results

  • analysisName (str) – Analysis name for the output name file

odamnet.methods_functions.DOMINOOutput(networkFileName, AMIFileName, featureName, outputPath)

Create output file of the active module identification analysis

Parameters:
  • networkFileName – Content of network file

  • AMIFileName – AMI results file name

  • featureName – chemical ID

  • outputPath – Output path name to save results

odamnet.methods_functions.createNetworkandBipartiteFiles(bipartiteName, networkName, pathOfInterestGenesDict)

Create a bipartite network between target genes and pathways of interest. Create a disconnected network between pathways of interest.

Parameters:
  • bipartiteName (filename) – Bipartite file name

  • networkName (FILENAME) – Network file name

  • pathOfInterestGenesDict (dict) – Dict of pathways of interest
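
One possible layout for the two files can be sketched as follows. The assumptions here are that each bipartite line is a tab-separated pathway and gene pair, and that the disconnected network lists each pathway linked only to itself, so that no two pathways are connected. The function name, the pathway IDs and the use of in-memory handles instead of file names are all hypothetical.

```python
import io


def write_bipartite_and_network(bipartite_handle, network_handle, pathway_genes):
    """Write pathway-gene pairs and a pathway network made of self-loops only."""
    for pathway, genes in pathway_genes.items():
        # Self-loop: the pathway exists as a node but has no edge to another pathway.
        network_handle.write(f"{pathway}\t{pathway}\n")
        for gene in sorted(genes):
            bipartite_handle.write(f"{pathway}\t{gene}\n")


# Hypothetical pathway IDs and gene symbols.
toy_pathways = {"WP0001": {"GENE2", "GENE1"}, "WP0002": {"GENE2"}}

bipartite_file = io.StringIO()
network_file = io.StringIO()
write_bipartite_and_network(bipartite_file, network_file, toy_pathways)
print(bipartite_file.getvalue())
print(network_file.getvalue())
```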

odamnet.methods_functions.downloadNDExNetwork(networkUUID, outputFileName, simplify)

Download a network from the NDEx website. Create a tab-separated file (SIF format) with a header and three columns: node1, interaction type and node2.

Parameters:
  • networkUUID (str) – Network ID

  • outputFileName (FILENAME) – SIF file name to write network

  • simplify (boolean) – if True, remove header and the interaction column
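
The effect of simplify on the written file can be sketched as follows. The download step is left out, and the edges are a hypothetical list of (node1, interaction, node2) tuples. The function name and the header labels are assumptions, not the package's own.

```python
import io


def write_sif(edges, handle, simplify=False):
    """Write edges as tab-separated lines; drop header and interaction if simplify."""
    if not simplify:
        handle.write("node1\tinteraction\tnode2\n")
    for node1, interaction, node2 in edges:
        if simplify:
            handle.write(f"{node1}\t{node2}\n")
        else:
            handle.write(f"{node1}\t{interaction}\t{node2}\n")


# Hypothetical edges.
toy_edges = [("GENE1", "interacts-with", "GENE2"), ("GENE2", "interacts-with", "GENE3")]

full_sif = io.StringIO()
write_sif(toy_edges, full_sif)

simple_sif = io.StringIO()
write_sif(toy_edges, simple_sif, simplify=True)

print(full_sif.getvalue())
print(simple_sif.getvalue())
```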

CTD functions

@author: Morgane Térézol.


Manage the retrieval of the target genes list. The target genes list can come from:
  • a chemicals file, whose chemicals are requested from CTD

  • a CTD file (a file created by a previous CTD request)

  • a list of target genes

odamnet.CTD_functions.readFeaturesFile(featuresFile)

Read a list file (composed of gene names or chemical names).

Parameters:

featuresFile (FILE) – Content of the features file

Returns:

  • featureNamesList (list) – List of feature names

odamnet.CTD_functions.readCTDFile(CTDFile, nbPub, outputPath)

Read a CTD file created from a previous request.

Parameters:
  • CTDFile (FILE) – Content of the CTD file

  • nbPub (int) – Minimum number of publications to keep a chemical-gene interaction

  • outputPath (PATH) – Output path directory name

Returns:

  • targetGenesDict (dict) – Dictionary of genes for each chemical as query
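
The role of nbPub can be sketched as a simple filter. The layout of the CTD file is not described here, so the rows below are a hypothetical pre-parsed form: (chemical, gene, list of publication IDs). The function name and all identifiers are placeholders.

```python
from collections import defaultdict


def filter_interactions(rows, nb_pub):
    """Keep chemical-gene interactions supported by at least nb_pub publications."""
    targets = defaultdict(set)
    for chemical, gene, publication_ids in rows:
        if len(publication_ids) >= nb_pub:
            targets[chemical].add(gene)
    # Sorted lists make the output order deterministic.
    return {chemical: sorted(genes) for chemical, genes in targets.items()}


# Hypothetical rows with placeholder identifiers.
toy_rows = [
    ("CHEM1", "GENE1", ["pub1", "pub2"]),
    ("CHEM1", "GENE2", ["pub3"]),
    ("CHEM1", "GENE3", ["pub4", "pub5", "pub6"]),
    ("CHEM2", "GENE2", ["pub7", "pub8"]),
    ("CHEM2", "GENE4", []),
]

target_genes_by_chemical = filter_interactions(toy_rows, nb_pub=2)
print(target_genes_by_chemical)
```

With nb_pub=2, the interactions CHEM1-GENE2 and CHEM2-GENE4 are dropped because they are supported by fewer than two publications.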

odamnet.CTD_functions.CTDrequest(chemName, association, outputPath, nbPub)

Request CTD database.

Search for all genes that interact with the chemicals given as input. Several chemical names can be given on the same line; they are then analysed as if they were a single chemical. If hierarchicalAssociations is used, chemicals related to the input chemical are also used in the query. Only genes present in Homo sapiens are kept.

Parameters:
  • chemName (str) – Chemical name(s) given as a string of MeSH IDs

  • association (str) – Association name (hierarchicalAssociations or directAssociations)

  • outputPath (str) – Folder path to save the results

  • nbPub (int) – Minimum number of publications to keep a chemical-gene interaction

Returns:

  • homoGenesList (list) – List of genes which interact with chemicals given in input (only Homo sapiens)

  • chemMeSH (str) – Combined MeSH IDs of the chemicals given as input

odamnet.CTD_functions.CTDrequestFromFeaturesList(chemList, association, outputPath, nbPub)

Make a CTD request for each chemical in the input list. Each element can be composed of one or more chemicals; if there are several, the analysis is done as if they were a single chemical.

Parameters:
  • chemList (list) – List of chemicals to request from CTD (MeSH IDs or chemical names)

  • association (str) – Association name (hierarchicalAssociations or directAssociations)

  • outputPath (str) – Folder path to save the results

  • nbPub (int) – Minimum number of publications to keep a chemical-gene interaction

Returns:

  • chemTargetsDict (dict) – Dict composed of interaction genes list for each chemical

odamnet.CTD_functions.targetGenesExtraction(chemicalsFile, directAssociations, outputPath, nbPub)

Read the chemicals file, request CTD and extract the target genes. Save the results into an output file and return the target genes list.

Parameters:
  • chemicalsFile (FILE) – Content of the chemicals file list

  • directAssociations (bool) – Whether to query the chemical only (direct associations) or its descendants too

  • outputPath (PATH) – Folder path name to save results

  • nbPub (int) – Minimum number of publications to keep an interaction

Returns:

  • chemTargetsDict (dict) – Dict composed of interaction genes list for each chemical

WP functions

@author: Morgane T.

WikiPathways functions

odamnet.WP_functions.readRequestResultsWP(WPrequestResult)

Read request from WP.

Parse and extract information from request. Extract genes, names and IDs of pathways.

Parameters:

WPrequestResult (bytes) – Request result from WikiPathway

Returns:

  • WPgenesDict (dictionary) – Dict of genes for each WikiPathway

  • WPnamesDict (dictionary) – Dict of titles for each WikiPathway

odamnet.WP_functions.rareDiseasesWPrequest(outputPath)

Request the WikiPathways database.

Search for all WikiPathways related to rare diseases, restricted to Homo sapiens pathways. Write the results into a result file.

Parameters:

outputPath (str) – Folder path to save the results

Returns:

  • WPgenesDict (dictionary) – Dict of genes for each RD WikiPathway

  • WPnamesDict (dictionary) – Dict of names for each RD WikiPathway

  • pathwayOfInterestList (list) – Pathway names list

odamnet.WP_functions.allHumanGenesFromWP(outputPath)

Extract all gene HGNC IDs from Homo sapiens WP. Write the request result into an output file.

Parameters:

outputPath (str) – Folder path to save the results

Returns:

  • backgroundsDict (dict) – Dict of all human genes from WP

odamnet.WP_functions.readGMTFile(GMTFile)

Parse and extract information from GMT file.

Parameters:

GMTFile (FILE) – Content of GMT file

Returns:

  • pathOfInterestGenesDict (dict) – Dict of genes for each pathway of interest

  • pathOfInterestNamesDict (dict) – Dict of names for each pathway of interest

  • pathwaysOfInterestList (list) – Pathway names list
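
A GMT file is commonly laid out as one gene set per line, tab-separated, with an identifier, a name or description, then the genes. A parsing sketch under that assumption is given below. The function name, the pathway IDs and the choice to return the list of identifiers as the third value are hypothetical and may differ from what readGMTFile actually does.

```python
import io


def parse_gmt(handle):
    """Parse tab-separated GMT lines: identifier, name, then gene symbols."""
    genes_by_pathway = {}
    names_by_pathway = {}
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank lines and lines without any gene
        pathway_id, pathway_name, *genes = fields
        genes_by_pathway[pathway_id] = [gene for gene in genes if gene]
        names_by_pathway[pathway_id] = pathway_name
    return genes_by_pathway, names_by_pathway, list(genes_by_pathway)


# Hypothetical GMT content.
toy_gmt = io.StringIO(
    "WP0001\tPathway one\tGENE1\tGENE2\n"
    "WP0002\tPathway two\tGENE3\n"
)
genes_dict, names_dict, pathway_ids = parse_gmt(toy_gmt)
print(genes_dict, names_dict, pathway_ids)
```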

odamnet.WP_functions.readBackgroundsFile(backgroundsFile)

Read a backgrounds file. Each line contains the name of the background source file corresponding to a pathway of interest. The order of the sources follows the order of the pathways of interest.

Parameters:

backgroundsFile (filename) – File name of the background source of each pathway of interest

Returns:

  • backgroundsDict (dict) – Dictionary of the background genes from different sources

  • backgroundsList (list) – List of the background gene sources to use
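
Because the backgrounds file relies on line order, each pathway of interest is matched with a background source by position. A small sketch of that pairing follows; the function name and the file names are hypothetical. A length check is added so that a missing line is reported instead of silently shifting the pairing.

```python
def pair_pathways_with_backgrounds(pathways, background_lines):
    """Pair each pathway with the background source found on the same line index."""
    sources = [line.strip() for line in background_lines if line.strip()]
    if len(sources) != len(pathways):
        raise ValueError(
            f"expected {len(pathways)} background sources, found {len(sources)}"
        )
    return list(zip(pathways, sources))


# Hypothetical pathway IDs and background file names.
toy_pathway_ids = ["WP0001", "WP0002", "WP0003"]
toy_background_lines = ["backgroundA.gmt\n", "backgroundA.gmt\n", "backgroundB.gmt\n"]

pairs = pair_pathways_with_backgrounds(toy_pathway_ids, toy_background_lines)
print(pairs)
```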