To install the App, use the App Manager in Cytoscape (version 3.7.0 or later), click “Install from File…”, and select the downloaded file.

Querying Networks

Building Queries

BioGateway is an RDF Graph Database or ‘triple store’, and can be searched by building a query consisting of a set of questions that together specify what you are looking for. In this context, the “graph” is a network of nodes – shown as circles – and edges – the arrows connecting the nodes (see Example Graph Database, animation, to the right).

In the animation to the right, we see a representation of a mock network in a (very small) graph database of animals and some of their properties. The first question (part of a final query) selects for the animals that have the property of being “kept as” a “Pet”. This returns a subset of the network containing the nodes representing “Pet”, and the animals with edges of the type “Kept as” pointing to the “Pet” node.

By adding more question parts to the query, we can further specify the search. The second line in the query restricts the pets to those that are “Walking”, eliminating the Parrot from the results. And by further constraining the results to those pets that also are “Chasing” mice, we end up with a final query that results in the network shown in the last part of the animation.

Note that even though “Mouse” is not an animal satisfying all the conditions of the query, it is included because it is part of the relevant network for the query. 
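The narrowing effect of each added question, and the reason “Mouse” still shows up in the final network, can be mimicked in a few lines of Python. This is only a sketch of the idea, not how BioGateway evaluates queries. The edge list is reconstructed from the description above (the Dog and the Mouse are assumed to walk, since only the Parrot is eliminated by that question), and the edge-type names follow the query notation used further down this page.

```python
# Toy edges (source, edge type, target) reconstructed from the description above.
edges = [
    ("Cat", "kept_as", "Pet"),
    ("Dog", "kept_as", "Pet"),
    ("Mouse", "kept_as", "Pet"),
    ("Parrot", "kept_as", "Pet"),
    ("Cat", "moves_by", "Walking"),
    ("Dog", "moves_by", "Walking"),    # assumption: implied by the text
    ("Mouse", "moves_by", "Walking"),  # assumption: implied by the text
    ("Cat", "chases", "Mouse"),
]

# Each question is an (edge type, target) pair that an animal must have.
questions = [("kept_as", "Pet"), ("moves_by", "Walking"), ("chases", "Mouse")]

animals = {source for source, _, _ in edges}
for edge_type, target in questions:
    # Keep only the animals that also answer this question.
    animals &= {s for s, e, t in edges if (e, t) == (edge_type, target)}
    print(f"after '{edge_type} {target}':", sorted(animals))

# The result network holds every edge that answered a question for a
# surviving animal, together with the nodes at both ends of those edges.
result_edges = [(s, e, t) for s, e, t in edges if s in animals and (e, t) in questions]
result_nodes = {node for s, _, t in result_edges for node in (s, t)}
print("nodes in result network:", sorted(result_nodes))
```

Only the Cat survives all three questions, yet the result network also contains “Mouse”, because it sits at the far end of the Cat's “chases” edge.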

Example Graph Database

The BioGateway database is powered by Virtuoso and can be queried through SPARQL, a graph query language based on the same principle as above. We construct a query by specifying, step by step, the nodes we want and the types of edges (relationships) between them.

Each new line represents an additional part of the query, consisting of a subject, a predicate and an object. Subjects and objects are always nodes, while predicates are edges. In BioGateway, these edges are also called relation types, because they represent a type of relation between two entities.

In BioGateway, the subjects and objects can either be bound to a specific value, like the node representing “Pet” or “Walking” in the animation above, or they can represent any value satisfying the conditions of the query. In the Cytoscape BioGateway App these unbound values are called “Sets”, as they represent the set of all values qualifying their part of the query.
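For readers curious what such a query might look like in SPARQL itself, the sketch below renders subject, predicate and object lines as SPARQL text using plain Python string formatting, taking the pet example from the animation as input. It is an illustration only: the `ex:` namespace and the `to_sparql` helper are hypothetical, the toy animal graph has no real URIs, and the queries that the BioGateway App actually sends are not shown in this tutorial. In SPARQL, unbound values are written as variables with a leading question mark.

```python
# Hypothetical namespace for the toy animal graph (not a BioGateway URI).
NAMESPACE = "http://example.org/animals/"


def render_term(term):
    """Variables (leading '?') stay as they are; bound values get the ex: prefix."""
    return term if term.startswith("?") else f"ex:{term}"


def to_sparql(patterns):
    """Render (subject, predicate, object) patterns as a SPARQL SELECT query."""
    variables = sorted({t for pattern in patterns for t in pattern if t.startswith("?")})
    body = "\n".join(
        f"  {render_term(s)} {render_term(p)} {render_term(o)} ."
        for s, p, o in patterns
    )
    return (
        f"PREFIX ex: <{NAMESPACE}>\n"
        f"SELECT DISTINCT {' '.join(variables)}\n"
        f"WHERE {{\n{body}\n}}"
    )


query = to_sparql([
    ("?animal", "kept_as", "Pet"),
    ("?animal", "moves_by", "Walking"),
    ("?animal", "chases", "Mouse"),
])
print(query)
```

Each tuple becomes one triple pattern inside the `WHERE` block, and the unbound value becomes the variable that the query returns.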

The last query from the animation above could be formulated as a BioGateway query as follows:

?animal kept_as Pet
?animal moves_by Walking
?animal chases Mouse

The “?animal” part is not bound to a specific value, but rather any value that can simultaneously satisfy all the parts of the query – i.e. be kept as a pet, and move by walking, and chase mice. The result of the query would be:

Cat kept_as Pet
Cat moves_by Walking
Cat chases Mouse

This is because “Cat” is the only value of “?animal” that would satisfy all the constraints of our query. For a simpler query, like the initial one from the animated example:

?animal kept_as Pet

We would get all the matching pets as results:

Cat kept_as Pet 
Mouse kept_as Pet
Parrot kept_as Pet
Dog kept_as Pet
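How an unbound value such as “?animal” gets resolved can be sketched with a small pattern matcher in Python. This is a toy model under stated assumptions, not the App's or Virtuoso's implementation: the `match` and `parse` helpers are hypothetical, and the triples are reconstructed from the examples above (the Dog and the Mouse are assumed to walk, since only the Parrot is eliminated by that line).

```python
# Toy triples reconstructed from the examples above.
TRIPLES = [
    ("Cat", "kept_as", "Pet"),
    ("Mouse", "kept_as", "Pet"),
    ("Parrot", "kept_as", "Pet"),
    ("Dog", "kept_as", "Pet"),
    ("Cat", "moves_by", "Walking"),
    ("Mouse", "moves_by", "Walking"),  # assumption: implied by the text
    ("Dog", "moves_by", "Walking"),    # assumption: implied by the text
    ("Cat", "chases", "Mouse"),
]


def parse(query):
    """Split a query into (subject, predicate, object) lines."""
    return [tuple(line.split()) for line in query.strip().splitlines()]


def match(patterns, triples):
    """Return every variable binding that satisfies all patterns at once."""
    bindings = [{}]
    for pattern in patterns:
        extended = []
        for binding in bindings:
            for triple in triples:
                candidate = dict(binding)
                for term, value in zip(pattern, triple):
                    if term.startswith("?"):
                        # Unbound term: bind it, or check an earlier binding.
                        if candidate.setdefault(term, value) != value:
                            break
                    elif term != value:
                        # Bound term: must equal the value in the triple.
                        break
                else:
                    extended.append(candidate)
        bindings = extended
    return bindings


full_query = "?animal kept_as Pet\n?animal moves_by Walking\n?animal chases Mouse"
print([b["?animal"] for b in match(parse(full_query), TRIPLES)])

simple_query = "?animal kept_as Pet"
print([b["?animal"] for b in match(parse(simple_query), TRIPLES)])
```

The first call leaves only the Cat, while the second returns all four pets. Because a binding made on one line is carried over to the next, each additional line can only keep or shrink the set of values for “?animal”.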

Example Tutorial

In the BioGateway App, we can use the Query Builder to build queries in a step-wise manner, using the same format as described above. A query consists of one or more lines, each specifying a specific node or a set of nodes, and its relation with another specific node or set of nodes.

To see how this works, open the Query Builder and load “Example 1a” (Figure 1).

Figure 1: open the Query Builder from the Control Panel and load the Example 1a

This will load a query consisting of two lines into the Query Builder. If you do not get these results, make sure that your internet connection is working, and that you have the latest version of the BioGateway App.

Figure 2: the query builder after loading the Example 1a query

The “Example 1a” query (Figure 2) shows two lines that specify and return: (1) all the genes that are a transcriptional target of the transcription factor protein FOXO4; and (2) all the proteins that are annotated to be involved in the GO Biological Process “response to hypoxia”. Clicking on the Run Query button will launch the query against the BioGateway backend and open the Query Results tab. For all the examples in this section we will select all the results (click in the results table and press Ctrl+A) and import them into a new network by clicking on “Import to New Network”.

Figure 3: query results highlighted for display as a network
Figure 4: the network that results from the “Example 1a” query, showing two sub-networks: all proteins annotated with the ontology term ‘response to hypoxia’ (the central hub, left network), and all genes regulated by transcription factor FOXO4 (the network on the right)

Note that the result of this query contains two separate clusters of nodes. There are two reasons for that:

  1. Lines 1 and 2 are asking for two independent and different sets (Set A in line 1, and Set B in line 2).
  2. Set A is composed of genes, while Set B is composed of proteins. It is important to bear in mind that Genes and Proteins are different entities in BioGateway.  

Let’s add an extra specification to the query, as shown in “Example 1b”. Clicking on that query shows a third line.

The new line asks for all the proteins (Set C) encoded by the genes in Set A. The result of this query is an even larger network, with many of the genes now functioning as hubs connected to several proteins, because a single gene can code for many proteins. In addition, the network is now fully connected, because some of the genes code for proteins that are annotated with the GO term selected in query line 2.

Figure 5: the query builder after loading the Example 1b query and the resulting network

To obtain a smaller network, “Example 1c” replaces Set C in line 3 with Set B. This small change restricts the proteins encoded by Set A to those that are involved in the Biological Process “response to hypoxia”:

Figure 6: the query builder after loading the Example 1c query
Figure 7: the results for the Example 1c query

The resulting network shown in Figure 7 is now much more manageable, and to the point. (We used Cytoscape’s yFiles hierarchic layout algorithm for this figure.)
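The difference between Example 1b and Example 1c comes down to whether line 3 introduces a fresh set or reuses one that an earlier line already constrains. The Python sketch below imitates this with ordinary sets. All gene and protein names are invented placeholders, not BioGateway results, and the set logic is an assumed simplification of how the query lines combine.

```python
# Invented placeholder data; none of these names come from BioGateway.
regulated_by_foxo4 = {"geneA", "geneB", "geneC"}     # line 1: Set A (genes)
involved_in_hypoxia = {"protA1", "protX", "protY"}   # line 2: Set B (proteins)
encodes = {                                          # gene -> protein edges
    ("geneA", "protA1"),
    ("geneA", "protA2"),
    ("geneB", "protB1"),
    ("geneC", "protC1"),
}

# Example 1b: line 3 uses a new Set C, so every protein encoded by a gene
# in Set A qualifies, whatever its annotation.
set_c = {protein for gene, protein in encodes if gene in regulated_by_foxo4}

# Example 1c: line 3 reuses Set B, so a protein must satisfy line 2
# (involved in hypoxia) and line 3 (encoded by a gene in Set A) together.
set_b = {
    protein
    for gene, protein in encodes
    if gene in regulated_by_foxo4 and protein in involved_in_hypoxia
}

print("Example 1b, Set C:", sorted(set_c))
print("Example 1c, Set B:", sorted(set_b))
```

Reusing a set name acts as an extra filter on that set, which is why the network in Example 1c is so much smaller than the one in Example 1b.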

This network can be further extended by adding an additional query line, as shown in the “Example 1d” query. Line 4 specifies the following: return all the proteins (Set C) that interact with the proteins that are encoded by genes regulated by FOXO4 (Set B).

Figure 8: Example 1d adds a new set, Set C, which includes any protein interacting with the proteins in Set B
Figure 9: by adding the new Set C to the query with the “molecularly interacts with” relation type, the resulting network is expanded with many more nodes

As can be seen in Fig. 9, adding an extra selection to the query has again increased the size of the network. One way to reduce it is shown in “Example 1e” (Fig. 10), where the interacting proteins in Set C are restricted to those involved in the Biological Process “response to hypoxia”.

Figure 10: the query builder after loading the Example 1e query
Figure 11: the resulting network from the query in Example 1e

The real value of the Query Builder lies in using it to formulate your own queries. A key aid here is the Autocomplete Search feature, which helps you find the correct biological entities and relationships for your query.

Figure 12: Example of the Autocomplete function. When entering another entity name the autocomplete provides a drop-down list with matching entities, from which you may choose the appropriate one to build a query centered around that protein.

With the help of the autocomplete search function you can quickly and efficiently redefine any of the biological entities or relationships in a query. Figure 13 shows the results after replacing FOXO4 with MYC in line 1, and the GO term “response to hypoxia” with “response to ionizing radiation” in line 2. This illustrates that all the examples from 1a to 1e can be used as templates for your own queries, and you will better appreciate the versatility of the App when tweaking these queries.

The next step is to learn how to create your own queries from scratch. For an introduction to that, please continue to the App Manual. More examples of how to use the BioGateway App to tackle biological questions can also be found on the Use Cases page.

Figure 13: By replacing FOXO4 with “MYC” in the 1st line and “response to hypoxia” with “response to ionizing radiation” in the 2nd line, we get a completely different network.

Allowing Self-loops in the network


The Exclude self-loops option in the Query Builder is activated by default. A self-loop is a result in which an entity has a relationship with itself, for instance a protein that forms a homodimer. Queries can be run with or without self-loops; including them often returns many more results.

The effect of this option is illustrated in the figure to the right: load the Example 1e query in the Query Builder, then uncheck the Exclude self-loops checkbox. The network is now allowed to contain self-interactions of nodes.

Next, run the query and import all the results. The resulting network contains several protein self-loops.
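What excluding self-loops amounts to can be sketched in Python as a simple filter over result edges. This is a minimal illustration, not the App's implementation: the `drop_self_loops` helper is hypothetical and the protein names are invented placeholders.

```python
# Invented result edges (source, relation type, target).
RELATION = "molecularly interacts with"
edges = [
    ("P1", RELATION, "P2"),
    ("P1", RELATION, "P1"),  # self-loop, e.g. a homodimer
    ("P2", RELATION, "P3"),
    ("P3", RELATION, "P3"),  # self-loop
]


def drop_self_loops(edges):
    """Keep only the edges whose source and target are different nodes."""
    return [edge for edge in edges if edge[0] != edge[2]]


with_loops = edges
without_loops = drop_self_loops(edges)

print(f"{len(with_loops)} results with self-loops")
print(f"{len(without_loops)} results with self-loops excluded")
```

With the checkbox ticked, only the edges between two distinct nodes would remain; unticking it corresponds to keeping the full list.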

A more detailed explanation of this setting can be found in the App Manual.
