Identifier Mapping

This protocol shows you how to map or translate identifiers from one database (e.g., Ensembl) to another (e.g., Entrez Gene). This is a common requirement for data analysis. In the context of Cytoscape, for example, identifier mapping is needed when you want to import data to overlay on a network but the keys in the data don't match those in the network. This protocol includes two distinct examples highlighting different lessons that may apply to your use case: species-specific mapping and protein to gene mapping.

Detailed information about the Cytoscape ID Mapper tool is available in "Identifier Mapping in Cytoscape: idmapper" (F1000 Research).

Species-specific Mapping

Background

When planning to import data, you need to consider the key columns you have in your network data and in your table data. It is recommended to always use proper identifiers as your keys (from public databases like Ensembl or Uniprot-TrEMBL). Relying on conventional symbols and names is non-standard and error-prone.
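To make the key-matching requirement concrete, here is a minimal Python sketch (independent of Cytoscape) that checks which keys in a data table match the keys in a network. The function name `key_overlap` is hypothetical, and the identifiers other than YDL194W are illustrative values only.

```python
def key_overlap(network_keys, data_keys):
    """Return (matched, unmatched) data keys relative to the network keys."""
    network_set = set(network_keys)
    matched = [k for k in data_keys if k in network_set]
    unmatched = [k for k in data_keys if k not in network_set]
    return matched, unmatched


# Node keys as they might appear in the network's name column (illustrative).
network_keys = ["YDL194W", "YDR277C", "YBR043C"]

# Data keyed by the same identifier type matches; a gene symbol does not.
matched, unmatched = key_overlap(network_keys, ["YDL194W", "SNF3"])
print(matched)    # ['YDL194W']
print(unmatched)  # ['SNF3']
```

Any data rows that end up in `unmatched` would not be attached to a node on import, which is the situation identifier mapping is meant to resolve.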

For this example, we are going to use the Yeast Perturbation sample network provided with Cytoscape, which can be loaded from the Starter Panel. This network has identifiers of the form YDL194W, which are Ensembl-supported identifiers for Yeast.

Load Network

  • When you first launch Cytoscape, the Starter Panel is visible in the main Network View Window, with shortcuts to sample files. Open the Yeast Perturbation sample file. The Starter Panel is also accessible from View → Show Starter Panel.
  • If you look in the Node Table, you’ll see that there are proper identifiers in the name column, like YDL194W.

Identifier Mapping

We are going to use the ID Mapper functionality in Cytoscape to map the Yeast Ensembl IDs in the name column to Entrez Gene IDs.

  • In the Node Table, right-click on the column header of the name column and click Map column....
  • In the ID Mapping interface, select Yeast as Species, Ensembl as Map from and Entrez Gene as To.

That's it! A new column (all the way to the right) will be added to the Node Table. You could now use this column to map data annotated with Entrez Gene IDs to the network.
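Conceptually, the Map column operation looks up each value of the source column in a mapping table and writes the result to a new column. The following Python sketch illustrates that idea only; it is not how ID Mapper is implemented. The function `map_column`, the row layout, and the target value shown are placeholders, not real Entrez Gene IDs.

```python
def map_column(rows, source_col, target_col, lookup):
    """Return new rows with target_col filled from lookup[row[source_col]].

    Rows whose source value has no mapping get None in the new column.
    """
    mapped = []
    for row in rows:
        new_row = dict(row)  # copy so the input table is left unchanged
        new_row[target_col] = lookup.get(row[source_col])
        mapped.append(new_row)
    return mapped


# Placeholder mapping: the value is NOT a real Entrez Gene ID.
lookup = {"YDL194W": "entrez-placeholder-1"}

node_table = [{"name": "YDL194W"}, {"name": "UNKNOWN_ID"}]
result = map_column(node_table, "name", "Entrez Gene", lookup)
for row in result:
    print(row)
```

Identifiers without a mapping are left empty in this sketch, so it is worth checking the new column for gaps before using it as an import key.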

Protein to Gene Mapping

Load Network

For this use case, you’ll need the STRING app to access the STRING database.

  • Install the stringApp from the Cytoscape App Store, or install it from within Cytoscape via Apps → App Manager....
  • In the Network Search bar at the top of the Network Panel, select STRING disease query from the drop-down, and type in breast cancer.
  • Click the options icon and set the Confidence (score) cutoff to 0.9 and the Maximum number of proteins to 150. Click the search icon to search.

Identifier Mapping

  • In the Node Table, right-click on the column header of the canonical name column and click Map column....
  • In the ID Mapping interface, select Human as Species, Uniprot as Map from and Ensembl as To.

A new column (all the way to the right) will be added to the Node Table.

Importing Data

Now that we have mapped our network nodes to Ensembl IDs, we can import data that is annotated with Ensembl IDs.

  • Load the expression data file: from the File menu, select Import → Table from File.... Alternatively, drag and drop the data file directly onto the Node Table.
  • Under Key Column for Network, select the new Ensembl column.
  • Click OK to import.
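The import step amounts to a join between the data file and the Node Table on the chosen key column. Below is a rough Python sketch of that join, assuming a comma-separated file. The function `import_table`, the column names `gene_id` and `expression`, the identifiers such as `ENSG_A`, and the values are all made up for illustration.

```python
import csv
import io


def import_table(node_rows, data_text, node_key, data_key):
    """Attach columns from CSV text to node rows where the keys match."""
    data_by_key = {r[data_key]: r for r in csv.DictReader(io.StringIO(data_text))}
    merged = []
    for node in node_rows:
        new_node = dict(node)
        extra = data_by_key.get(node.get(node_key))
        if extra is not None:
            # Copy every data column except the key itself.
            new_node.update({k: v for k, v in extra.items() if k != data_key})
        merged.append(new_node)
    return merged


# Hypothetical data file contents and node table.
data_text = "gene_id,expression\nENSG_A,1.5\nENSG_C,-0.7\n"
nodes = [{"name": "P1", "Ensembl": "ENSG_A"}, {"name": "P2", "Ensembl": "ENSG_B"}]

merged = import_table(nodes, data_text, node_key="Ensembl", data_key="gene_id")
print(merged[0])  # {'name': 'P1', 'Ensembl': 'ENSG_A', 'expression': '1.5'}
print(merged[1])  # {'name': 'P2', 'Ensembl': 'ENSG_B'}
```

Nodes whose key has no counterpart in the data file simply keep their existing columns, and data rows with no matching node are dropped.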

We can now visualize the expression values on the network as node fill color. If we add a simple Continuous Mapping for node Fill Color, remove the STRING glass ball effect, add a darker node border, and apply a force-directed layout, the network will look like this:

Learn more about Importing Data

Learn more about Styles
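As a rough illustration of what a Continuous Mapping for fill color does, the sketch below linearly interpolates a color between two end points of a numeric range. This is a simplified two-point version written for this protocol, not Cytoscape's implementation; the function name `continuous_color`, the value range, and the blue-to-red colors are assumptions.

```python
def continuous_color(value, low, high, low_rgb, high_rgb):
    """Linearly interpolate an RGB hex color for value between low and high."""
    if high == low:
        raise ValueError("low and high must differ")
    # Clamp so values outside the range take the end colors.
    t = max(0.0, min(1.0, (value - low) / (high - low)))
    rgb = tuple(round(a + t * (b - a)) for a, b in zip(low_rgb, high_rgb))
    return "#{:02X}{:02X}{:02X}".format(*rgb)


blue, red = (0, 0, 255), (255, 0, 0)
print(continuous_color(-2.0, -2.0, 2.0, blue, red))  # #0000FF
print(continuous_color(2.0, -2.0, 2.0, blue, red))   # #FF0000
```

Values below or above the range are clamped to the end colors here, so outliers do not produce invalid colors.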

BridgeDb App

The built-in identifier mapping function is intended to handle the majority of common ID mapping problems, but it has limitations. If you need an ID mapping solution for species or ID types not covered by this tool, or if you want to connect to alternative sources of mappings, check out the BridgeDb app.

BridgeDb supports ID mapping resources from delimited text, BridgeDb files, the BridgeDb web service, and the BioMart web service.
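To illustrate the delimited-text option, a mapping file can be thought of as two columns of identifiers, one per database. The sketch below parses such text into a lookup in plain Python. It does not use the BridgeDb API, and the function `load_mapping`, the tab-separated layout, the header names, and the identifiers are all assumptions for the example.

```python
import csv
import io
from collections import defaultdict


def load_mapping(text, source_col, target_col, delimiter="\t"):
    """Parse delimited text into {source_id: [target_ids]} (one-to-many)."""
    mapping = defaultdict(list)
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    for row in reader:
        src, tgt = row[source_col], row[target_col]
        # Skip incomplete rows and avoid duplicate targets.
        if src and tgt and tgt not in mapping[src]:
            mapping[src].append(tgt)
    return dict(mapping)


# Hypothetical mapping file; the identifiers are placeholders.
text = "uniprot\tensembl\nP_A\tENSG_1\nP_A\tENSG_2\nP_B\tENSG_3\n"
mapping = load_mapping(text, "uniprot", "ensembl")
print(mapping)  # {'P_A': ['ENSG_1', 'ENSG_2'], 'P_B': ['ENSG_3']}
```

The lookup stores a list per source identifier because one identifier can map to several targets in another database.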