One of the main paradigms embraced by the KnowEnG Center is the concept of “knowledge-guided analysis,” in which researchers analyze their own data in the context of publicly available data. Primary sources of this public data include gene expression data sets, gene homology relationships, protein-protein and gene-gene interactions, gene ontologies, and literature-based relationships.
Knowledge Network (KN) research works to develop a pipeline that produces a heterogeneous network, termed the “Knowledge Network,” which serves as a compendium of community data sets ready for computation and investigation.
This pipeline uses a scalable cloud infrastructure to automatically check our selected data sources for updates, download and parse relevant information, map gene aliases to a set of stable identifiers, and import the resulting information into databases that store the network. Nodes in the resulting network represent genes (together with their respective transcripts and proteins) and properties, while edges correspond to relationships between them.
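The alias-mapping and import steps described above can be illustrated with a minimal sketch. It is not the KnowEnG implementation: the alias table, the identifiers such as `STABLE_0001`, the edge-type names, and the functions `map_alias` and `build_network` are all hypothetical and chosen only for illustration, and an in-memory dictionary stands in for the databases that store the network.

```python
from collections import defaultdict
from typing import NamedTuple, Optional


class Edge(NamedTuple):
    source: str      # stable gene identifier
    target: str      # stable gene identifier or property name
    edge_type: str   # kind of relationship


# Hypothetical alias table (alias -> stable identifier); placeholder values only.
ALIAS_TO_STABLE = {
    "geneA": "STABLE_0001",
    "GENE-A": "STABLE_0001",
    "geneB": "STABLE_0002",
}

# Assumed edge types whose target is also a gene and therefore needs mapping.
GENE_GENE_TYPES = {"protein_interaction", "homology"}


def map_alias(alias: str) -> Optional[str]:
    """Return the stable identifier for an alias, or None if it is unknown."""
    return ALIAS_TO_STABLE.get(alias)


def build_network(raw_records):
    """Group parsed (source, target, edge_type) records into edges by type.

    Returns the network and a list of aliases that could not be mapped.
    """
    network = defaultdict(set)
    unmapped = []
    for source_alias, target, edge_type in raw_records:
        source = map_alias(source_alias)
        if source is None:
            unmapped.append(source_alias)
            continue
        if edge_type in GENE_GENE_TYPES:
            # Gene-gene edge: the target is a gene alias as well.
            mapped_target = map_alias(target)
            if mapped_target is None:
                unmapped.append(target)
                continue
            target = mapped_target
        # Otherwise the target is a property node and is kept as-is.
        network[edge_type].add(Edge(source, target, edge_type))
    return network, unmapped


if __name__ == "__main__":
    records = [
        ("geneA", "geneB", "protein_interaction"),
        ("GENE-A", "TERM_X", "ontology_term"),
        ("unknown", "geneB", "homology"),
    ]
    net, missing = build_network(records)
    for edge_type, edges in net.items():
        print(edge_type, sorted(edges))
    print("unmapped aliases:", missing)
```

Keeping edges grouped by type is one simple way to represent a heterogeneous network, since gene-gene and gene-property relationships can then be queried separately. Collecting unmapped aliases rather than silently dropping them makes gaps in the alias table visible.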