5.4.1 Network analysis

The transcriptional regulatory network is the system of molecular interactions that governs gene expression. It consists of transcription factors, which bind specific DNA sequences, and enhancers, which are marked by enhancer RNAs (eRNAs); together they regulate the expression of target genes. The network is dynamic and responds to stimuli such as changes in the cell's environment or developmental signals. Changes in the network alter gene expression patterns, which in turn can substantially change cellular function.

5.4.2 What it does

This module performs network analysis on user-provided source-target files, which contain regulatory relationships, and on optional filtered gene files.

5.4.3 Example

The example below applies network analysis to identify, visualize, and compare enhancer-gene interactions in different cell lines. Here is the data for the example.

1. Compute the network from the expression file of a given sample and the annotated functional connections of TF-gene and enhancer-gene pairs.

nasap network_analysis --tf_source ./Homo_sapiens.GRCh38_tf_target.txt --enhancer_source ./Homo_sapiens.GRCh38_enhancer_target.txt --express_file ./express.csv --output_root ./test_output/

2. Find the network community for a genomic region of interest, for example chr8:144700000-144900000.

[Figure: community]

3. Visualize the enhancer-gene interactions on a genome browser.

nasap network_links --specie human --gtf ./Homo_sapiens.GRCh38.93.gtf --forward_bw ./forward.bw --reverse_bw ./reverse.bw --region chr8:144700000-144900000 --output_root ./test_output/

[Figure: track]

4. If networks are built for two different samples, for example HEK and Jurkat_T cells, differential enhancer-gene interactions can be identified.

[Figure: cluster_compare]
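
One simple way to think about differential interactions is as the difference between the edge sets of two networks. The sketch below illustrates that idea only; it is not nasap's comparison method. The function name `compare_networks` and the toy edges are hypothetical.

```python
def compare_networks(edges_a, edges_b):
    """Split two edge lists into sample-specific and shared interactions."""
    a, b = set(edges_a), set(edges_b)
    return {
        "only_a": sorted(a - b),  # interactions seen only in the first sample
        "only_b": sorted(b - a),  # interactions seen only in the second sample
        "shared": sorted(a & b),  # interactions present in both samples
    }

# Toy (made-up) enhancer-gene edges for two samples.
hek = [("enhancer1", "gene1"), ("enhancer2", "gene2")]
jurkat = [("enhancer1", "gene1"), ("enhancer3", "gene2")]

result = compare_networks(hek, jurkat)
for label, edges in result.items():
    print(label, edges)
```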

5.4.5 Parameters

Network_analysis

nasap network_analysis --help

At least one of --tf_source and --enhancer_source must be provided.

parameter            description
--tf_source          (Optional) transcription factor source file.
--enhancer_source    (Optional) enhancer source file.
--select_nodes_file  (Optional) filtered gene file.
--express_file       (Optional) gene expression file used to filter genes in the network.

--tf_source:
The file contains one regulatory relationship per line.

For example:

# source gene, target gene
tf1,gene1
tf1,gene2
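
As a rough illustration of this format, the sketch below reads such a file into a list of (source, target) pairs using only the Python standard library. The function name `read_source_target` is hypothetical and not part of nasap, and treating lines that start with `#` as comments is an assumption based on the example above.

```python
import csv
import io

def read_source_target(handle):
    """Read 'source,target' lines into a list of (source, target) tuples."""
    edges = []
    for row in csv.reader(handle):
        # Skip blank lines and comment lines such as '# source gene, target gene'
        # (assumption: '#' marks a header/comment line).
        if not row or row[0].lstrip().startswith("#"):
            continue
        # Skip malformed lines that do not have two columns.
        if len(row) < 2:
            continue
        edges.append((row[0].strip(), row[1].strip()))
    return edges

# In-memory stand-in for a source-target file.
example = io.StringIO("# source gene, target gene\ntf1,gene1\ntf1,gene2\n")
edges = read_source_target(example)
print(edges)
```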

--enhancer_source:
Same format as the tf_source file. For example:

# source gene, target gene
enhancer1,gene1
enhancer2,gene2 

--enhancer_filter_nodes:
Genes listed in this file are retained; all other genes are removed from the network. For example:

# gene name
tf1
enhancer1
gene1
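
One plausible reading of this filtering is sketched below: an edge is kept only when both of its endpoints appear in the file. Whether nasap requires both endpoints or only one is not stated here, so the both-endpoints rule and the function name `filter_edges` are assumptions.

```python
def filter_edges(edges, keep_nodes):
    """Keep only edges whose source and target are both in keep_nodes."""
    keep = set(keep_nodes)
    # Assumption: an edge survives only if both endpoints are retained.
    return [(s, t) for s, t in edges if s in keep and t in keep]

edges = [
    ("tf1", "gene1"),
    ("tf1", "gene2"),
    ("enhancer1", "gene1"),
    ("enhancer2", "gene2"),
]
keep_nodes = ["tf1", "enhancer1", "gene1"]

filtered = filter_edges(edges, keep_nodes)
print(filtered)
```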

--express_file:
Only expressed genes are used for network construction. For example:

# gene name, count
gene1,12
gene2,35

Network_links

Extract regulation data from the database and visualize it.

parameter      description
--specie       (Required) species.
--region       (Required) regulation region.
--gtf          (Optional) GTF file.
--forward_bw   (Optional) forward-strand bigWig file.
--reverse_bw   (Optional) reverse-strand bigWig file.
--rpkm_file    (Optional) RPKM file.
--output_root  output root directory.

--specie:
The species value must be one of: ['arabidopsis', 'chicken', 'chimp', 'fly', 'frog', 'mouse', 'human', 'rat', 'rhesus', 'sheep', 'worm', 'yeast', 'zebrafish']

--region:
The region must be written as chromosome:start-end. For example, chr8:1000 - chr8:2000 must be written as chr8:1000-2000.
The start and end must be on the same chromosome.
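
The region format can be checked before calling the tool. The sketch below is a hypothetical helper, not part of nasap: the names `format_region` and `parse_region` are illustrative, and the requirement that start be smaller than end is an assumption.

```python
import re

# chromosome:start-end, with no spaces and a single chromosome name.
REGION_RE = re.compile(r"^(?P<chrom>[^:\s]+):(?P<start>\d+)-(?P<end>\d+)$")

def format_region(chrom, start, end):
    """Build a region string such as 'chr8:1000-2000'."""
    if start >= end:  # assumption: start must be smaller than end
        raise ValueError("start must be smaller than end")
    return f"{chrom}:{start}-{end}"

def parse_region(text):
    """Parse 'chrom:start-end' into (chrom, start, end); raise ValueError if malformed."""
    match = REGION_RE.match(text)
    if match is None:
        raise ValueError(f"region must look like chr8:1000-2000, got {text!r}")
    start, end = int(match.group("start")), int(match.group("end"))
    if start >= end:  # assumption: start must be smaller than end
        raise ValueError("start must be smaller than end")
    return match.group("chrom"), start, end

print(format_region("chr8", 1000, 2000))
print(parse_region("chr8:144700000-144900000"))
```

A string such as `chr8:1000 - chr8:2000` does not match the pattern, so `parse_region` raises `ValueError` for it.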

--rpkm_file:
The RPKM file is used to keep only expressed genes in the region.

5.4.6 Results

Network measurement

Measure                    Value
nodes num                  45387
edges num                  2655783
mean degree                117.0283561372199
assortativity coefficient  -0.25198190632480205
correlation coefficient    -0.25198190632480194
transitivity               0.004697845986173299
density                    0.0012892561157319428
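
Two of these values can be reproduced from the node and edge counts alone. The reported mean degree is consistent with 2E/N (each edge contributes to the degree of two nodes), and the reported density is consistent with the directed-graph formula E/(N(N-1)). This is an observation drawn from the numbers in the table, not a statement about how nasap computes them.

```python
import math

nodes = 45387      # "nodes num" from the table
edges = 2655783    # "edges num" from the table

# Mean degree: every edge adds one to the degree of each of its two endpoints.
mean_degree = 2 * edges / nodes

# Density of a directed graph: edges divided by the number of ordered node pairs.
density = edges / (nodes * (nodes - 1))

print(mean_degree)
print(density)

# Compare with the values reported in the table.
print(math.isclose(mean_degree, 117.0283561372199, rel_tol=1e-6))
print(math.isclose(density, 0.0012892561157319428, rel_tol=1e-6))
```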

Degree distribution

[Figure: network degree distribution plot]

The statistical distribution of node degrees in the network.
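
A degree distribution can be tabulated from an edge list as sketched below. The sketch counts total degree (incoming plus outgoing edges); whether the plot uses total, in-, or out-degree is not stated here, so that choice and the function name `degree_distribution` are assumptions.

```python
from collections import Counter

def degree_distribution(edges):
    """Return a Counter mapping degree -> number of nodes with that degree."""
    degree = Counter()
    for source, target in edges:
        # Assumption: total degree, so both endpoints are counted.
        degree[source] += 1
        degree[target] += 1
    return Counter(degree.values())

edges = [
    ("tf1", "gene1"),
    ("tf1", "gene2"),
    ("enhancer1", "gene1"),
    ("enhancer2", "gene2"),
]

dist = degree_distribution(edges)
for k in sorted(dist):
    print(k, dist[k])
```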

Motif distribution

[Figure: motif distribution plot]

The plot counts the occurrences of each triadic motif profile in the network.

Community analysis

In graph theory, a community refers to a subset of nodes within a graph where connections between the nodes are denser than connections with nodes outside of the subset. The module is designed to identify all communities within a network.
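
The definition can be made concrete by counting, for a candidate subset of nodes, the edges that stay inside the subset and the edges that cross its boundary. The sketch below only illustrates the definition on a toy graph; it is not the community detection algorithm used by the module, and `community_edge_counts` is a hypothetical name.

```python
def community_edge_counts(edges, members):
    """Count edges inside a node subset and edges crossing its boundary."""
    members = set(members)
    internal = 0
    boundary = 0
    for source, target in edges:
        inside = (source in members) + (target in members)
        if inside == 2:
            internal += 1   # both endpoints are in the subset
        elif inside == 1:
            boundary += 1   # exactly one endpoint is in the subset
    return internal, boundary

# Toy graph: two triangles joined by a single bridge edge (c, d).
edges = [
    ("a", "b"), ("b", "c"), ("a", "c"),
    ("d", "e"), ("e", "f"), ("d", "f"),
    ("c", "d"),
]

counts = community_edge_counts(edges, {"a", "b", "c"})
print(counts)
```

In this toy graph the subset {a, b, c} has more internal edges than boundary edges, which is the pattern the definition describes.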