RBPome

An integrative resource of CLIP-seq studies

About CLIPdb

     CLIPdb is a collection of publicly available CLIP-seq data sets covering various CLIP-seq technologies and species. It helps users quickly navigate to the CLIP-seq data of interest. CLIPdb also provides RBP binding sites, all generated from raw CLIP-seq data with the same computational method. Please refer to the CLIPdb paper for a more detailed description of the database.

Figure 1. CLIPdb data flow and framework

Frequently Asked Questions (FAQ)

Q1: How to query CLIP-seq data sets?

The "binding sites navigation" module of the database contains four view tabs.

1. "factor" view
Users can select the RBPs of interest and view detailed information about the corresponding studies in a data matrix, with links to the data source and literature for further research. Note that the RBP annotation matrices are available only under the "factor" view.

2. "species" view
Users can browse the CLIP-seq data sets for a specific species.

3. "cell line" view
Users can look up which RBPs have been profiled in a specific cell line.

4. "technology" view
Users can search CLIP-seq studies by CLIP technology.

Q2: How to download binding sites?

     Binding sites can also be downloaded from the "binding sites navigation" module. Enter or select a gene name in the "factor" view, and the database returns the matching CLIP-seq samples. For each CLIP-seq sample, users can link out to the raw CLIP-seq data in GEO. The database also provides, for each sample, a downloadable BED file containing the binding sites and their p-values (from the Piranha software). The transcriptome-wide binding sites from each sample are annotated with genomic elements, and the annotation is summarized in pie charts.

     We provide "bulk download" of all binding sites for each RBP in the "factor" view and for each cell line in the "cell line" view. The downloads include binding sites identified by Piranha as well as those identified by other specialized tools.

     Note that we do not provide binding sites for 14 data sets whose raw data could not be easily accessed or were of low quality according to our results.

Q3: What do the columns of the binding site BED file mean?

     The columns in the BED files are:
     1. chrom - chromosome name
     2. chromStart - the starting position of the binding site in the chromosome
     3. chromEnd - the end position of the binding site in the chromosome
     4. binding site name
     5. score - CLIP-seq read coverage within this binding site (Piranha and CIMS/CITS) or binding affinity score (PARalyzer)
     6. strand - "." means the binding site carries no strand information. Users should overlap the coordinates of the binding site with those of transcripts to determine its strand and parent transcript
     7. p-value - p-value reported by the tool that identified the binding site (Piranha and CIMS/CITS)
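
     The column layout above can be read with a few lines of Python. The following is a minimal sketch, not code from CLIPdb: it assumes the files are tab-delimited, that coordinates follow the usual BED convention (0-based start, exclusive end), and that the p-value column may be missing or set to "." for some tools. The names `BindingSite`, `parse_bed_line` and `infer_strand`, and the transcript tuple layout, are hypothetical.

```python
from typing import Iterable, NamedTuple, Optional, Tuple


class BindingSite(NamedTuple):
    chrom: str
    start: int                 # chromStart
    end: int                   # chromEnd
    name: str
    score: float               # read coverage or binding affinity score
    strand: str                # "+", "-" or "." (no strand information)
    p_value: Optional[float]   # None when the column is absent or "."


def parse_bed_line(line: str) -> BindingSite:
    """Parse one tab-delimited line with the seven columns listed above."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 6:
        raise ValueError(f"expected at least 6 columns, got {len(fields)}")
    has_p = len(fields) > 6 and fields[6] not in ("", ".")
    return BindingSite(
        chrom=fields[0],
        start=int(fields[1]),
        end=int(fields[2]),
        name=fields[3],
        score=float(fields[4]),
        strand=fields[5],
        p_value=float(fields[6]) if has_p else None,
    )


# Hypothetical transcript record: (chrom, start, end, strand)
Transcript = Tuple[str, int, int, str]


def infer_strand(site: BindingSite, transcripts: Iterable[Transcript]) -> str:
    """Return the strand of the overlapping transcripts, or "." if ambiguous."""
    if site.strand in ("+", "-"):
        return site.strand
    strands = {
        t_strand
        for t_chrom, t_start, t_end, t_strand in transcripts
        if t_chrom == site.chrom and t_start < site.end and site.start < t_end
    }
    # Only assign a strand when every overlapping transcript agrees.
    return strands.pop() if len(strands) == 1 else "."
```

     In this sketch a site that overlaps transcripts on both strands, or no transcript at all, keeps "." so that ambiguous cases stay visible rather than being assigned arbitrarily.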

Q4: How to search binding sites for a given gene?

     Users can enter or select a gene name in the search box of the "binding target search" module. The database then returns the RBPs that have binding sites in that gene. Users can further obtain detailed information about these binding sites, such as chromosome location, binding strength, p-value, and the CLIP-seq sample or study they come from.
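
     The same kind of lookup can be reproduced offline on downloaded BED files. The sketch below is an illustration under stated assumptions, not the database's implementation: the function name `rbps_binding_gene`, the tuple layouts, and the placeholder RBP names are hypothetical, and intervals are assumed to be 0-based with an exclusive end.

```python
from typing import Dict, List, Tuple

# Hypothetical layouts: a gene is (chrom, start, end) and a binding site is
# (chrom, start, end, p_value).
Gene = Tuple[str, int, int]
Site = Tuple[str, int, int, float]


def rbps_binding_gene(gene: Gene, sites_by_rbp: Dict[str, List[Site]]) -> Dict[str, List[Site]]:
    """Return, for each RBP, the binding sites that overlap the gene interval."""
    chrom, gene_start, gene_end = gene
    hits: Dict[str, List[Site]] = {}
    for rbp, sites in sites_by_rbp.items():
        inside = [
            s for s in sites
            if s[0] == chrom and s[1] < gene_end and gene_start < s[2]
        ]
        if inside:
            # Most significant sites (smallest p-value) first.
            hits[rbp] = sorted(inside, key=lambda s: s[3])
    return hits


if __name__ == "__main__":
    # Made-up coordinates and placeholder RBP names, for illustration only.
    sites = {
        "RBP_A": [("chr1", 1200, 1250, 0.004), ("chr1", 1500, 1540, 0.0002)],
        "RBP_B": [("chr2", 300, 340, 0.001)],
    }
    for rbp, found in rbps_binding_gene(("chr1", 1000, 2000), sites).items():
        print(rbp, found)
```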

Q5: How to visualize binding sites? What does "cutoff 0.01/0.001" mean?

     Users can visualize the transcriptome-wide binding sites using the "browser" module and choose between two p-value cutoffs (0.01 or 0.001, assigned by the Piranha software). The smaller the p-value, the less likely the binding site reflects background signal.
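
     A comparable cutoff can be applied to a downloaded BED file. This is a minimal sketch, assuming tab-delimited lines with the p-value in the seventh column; whether the browser treats the cutoff as inclusive is not stated, so the `<=` comparison here is an assumption, and `filter_by_p_value` is a hypothetical name.

```python
from typing import Iterable, List


def filter_by_p_value(bed_lines: Iterable[str], cutoff: float = 0.01) -> List[str]:
    """Keep BED lines whose seventh column (p-value) is at or below the cutoff."""
    kept = []
    for line in bed_lines:
        line = line.rstrip("\n")
        # Skip blank lines and common BED header lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        try:
            p_value = float(fields[6])
        except (IndexError, ValueError):
            # No usable p-value on this line; leave it out of the filtered set.
            continue
        if p_value <= cutoff:
            kept.append(line)
    return kept
```

     Calling the function with `cutoff=0.001` keeps a subset of the sites kept at `cutoff=0.01`, which matches the idea that the stricter setting shows fewer, more confident binding sites.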

     Users can choose specific RBPs and data sets through the "select tracks" button to load their transcriptome-wide binding sites. To investigate RBP co-binding in a gene, users can select multiple or all RBPs at once through the same button.

Q6: How to obtain more detailed information about RBPs?

     Users can obtain more detailed information about RBPs from the "factor" view in the "binding sites navigation" module. The "summary" matrix contains detailed information about the RBP, including its gene name, species, Ensembl ID, synonyms and RNA binding domain.

     For some RBPs, further information about the RNA binding motif is accessible by clicking the arrow in the matrix. Users can obtain individual RNA recognition motifs, together with their assay method, motif evidence, PWM, motif logo, and source database. Because RBP homologs share similar motifs, the RNA recognition motifs of RBPs that lack direct experimental evidence can be inferred from their homologs. We annotate these RBPs with the motif information from their most similar homologs (motif evidence: sequence similarity/species/homologous RBP).
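
     One common way to use a PWM is to scan an RNA sequence for its best-matching window. The sketch below illustrates that general technique and is not taken from CLIPdb: it assumes the PWM is given as one dictionary of base probabilities per motif position, uses a uniform background of 0.25, and adds a small pseudocount (without renormalizing) to avoid taking the logarithm of zero. The function names and the toy motif are hypothetical.

```python
import math
from typing import Dict, List, Tuple

# Hypothetical PWM layout: one {base: probability} dictionary per position.
PWM = List[Dict[str, float]]


def pwm_log_odds(pwm: PWM, kmer: str, background: float = 0.25,
                 pseudocount: float = 1e-3) -> float:
    """Sum of per-position log2(probability / background) for one k-mer."""
    score = 0.0
    for column, base in zip(pwm, kmer):
        probability = column.get(base, 0.0) + pseudocount
        score += math.log2(probability / background)
    return score


def best_match(pwm: PWM, sequence: str) -> Tuple[int, float]:
    """Return (offset, score) of the highest-scoring window in the sequence."""
    width = len(pwm)
    rna = sequence.upper().replace("T", "U")  # accept DNA-style input
    if len(rna) < width:
        raise ValueError("sequence is shorter than the motif")
    windows = [
        (pwm_log_odds(pwm, rna[i:i + width]), i)
        for i in range(len(rna) - width + 1)
    ]
    score, offset = max(windows, key=lambda w: w[0])
    return offset, score


if __name__ == "__main__":
    # Toy three-position motif that strongly prefers "UGU" (made-up values).
    toy_pwm = [
        {"A": 0.01, "C": 0.01, "G": 0.01, "U": 0.97},
        {"A": 0.01, "C": 0.01, "G": 0.97, "U": 0.01},
        {"A": 0.01, "C": 0.01, "G": 0.01, "U": 0.97},
    ]
    print(best_match(toy_pwm, "AAUGUAA"))
```

     A positive score means the window resembles the motif more than the uniform background does; choosing a meaningful score threshold depends on the motif and is outside the scope of this sketch.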

Q7: How to quickly access CLIP-seq- and RBP-related computational tools or databases?

     Users can quickly access CLIP-seq- and RBP-related computational tools and databases through the "tools" module. It lists various publicly available software packages and databases for different uses, and users can easily link out to these computational resources. We acknowledge the authors of these computational tools and databases.

Reference reviews

     Ascano M, Hafner M, Cekan P, Gerstberger S, Tuschl T. 2012. Identification of RNA-protein interaction networks using PAR-CLIP. Wiley interdisciplinary reviews RNA 3(2): 159-177.

     Konig J, Zarnack K, Luscombe NM, Ule J. 2011. Protein-RNA interactions: new genomic technologies and perspectives. Nature reviews Genetics 13(2): 77-83.