Home   Browse   Search   Resources   Credits   Statistics   Help   Contact   GQRes  
Ramapo College Bioinformatics Group
GRSDB - The 'G'-Rich Sequences Database

GRSDB Help

Search Help Glossary of terms Understanding G-Scores Dealing with overlaps

Searching GRSDB

The main purpose of the search function in GRSDB is to identify genes of interest. Once a gene has been found, the user can analyze its transcribed regions further: the composition of the putative G-quadruplex sequences and their location relative to the RNA processing sites.

The search function offers the following user choices. Note: all searches are case-insensitive.

Organism:

At present, GRSDB contains information for two organisms: human and mouse. The user may perform a search on either organism by itself or all organisms (which is the default).

Search Field:

There are nine different fields that the user may search on. These are described in the table below.

Search Field | Comments
Gene Name | The official NCBI gene name
Accession Number | The NCBI accession number
GRSDB ID | The unique identifier used in GRSDB
Gene Identifier | The unique identifier used in NCBI
Chromosome Number | Which chromosome the gene is located on
Alternatively Spliced | Possible values: yes/no or, equivalently, 1/0
Alternatively Polyadenylated | Possible values: yes/no or, equivalently, 1/0
Number of Products | The number of alternatively spliced RNA products of the gene
Date of Entry | Date the gene was uploaded into GRSDB, in the format yyyy-mm-dd
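As an illustration of the value formats listed above, here is a minimal Python sketch that interprets a yes/no (or 1/0) flag and a yyyy-mm-dd date. The function names `parse_flag` and `parse_entry_date` are hypothetical helpers and are not part of GRSDB. The sketch only shows how such values could be checked before being typed into the search form.

```python
from datetime import date, datetime


def parse_flag(value: str) -> bool:
    """Interpret yes/no or 1/0 (in any letter case) as a boolean."""
    normalized = value.strip().lower()
    if normalized in ("yes", "1"):
        return True
    if normalized in ("no", "0"):
        return False
    raise ValueError(f"expected yes/no or 1/0, got {value!r}")


def parse_entry_date(value: str) -> date:
    """Parse a Date of Entry value written as yyyy-mm-dd."""
    return datetime.strptime(value.strip(), "%Y-%m-%d").date()
```

For example, `parse_flag("YES")` and `parse_flag("1")` both give `True`, which mirrors the statement that the two spellings are equivalent.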

Logical Operators:

There are seven different conditions under which one can perform a search. The corresponding operators are described in the table below.

Operator

Comments

=

Used to find genes for which the selected search field exactly equals a user-supplied expression.

>, <

Primarily used for the number of products.

Example: to find all genes in GRSDB with more than 2 products.

!=

Used to find genes that do not match a user-supplied expression.

Example: to find all human genes in GRSDB not on chromosome 10.

LIKE

Used to find all genes that are similar to a user supplied expression. Supports two wild cards: % which matches 0 or more characters and _ which matches exactly one character.

Example: If the user searches for all genes with gene names like CTS_ (genes that start with CTS followed by 1 character) then the search returns three genes: CTSH, CTSL, Ctsm

Example: If the user searches for all genes with gene names like AC% (genes that start with AC followed by anything) then the search returns seven genes: ACCN1, ACCN3, Acd, Ache, ACOX1, ACP1, ACPT

REGEXP

Used to find all genes that match a user-supplied regular expression. For more on regular expression syntax, see http://www.regular-expressions.info/reference.html

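Example: If the user wants to find all genes with gene names starting with an A, followed by two characters that are each a C or a D, use the regular expression ^A[CD]{2} (no comma is needed inside the brackets; a comma there would itself be treated as one of the characters to match). This returns four genes: ACCN1, ACCN3, Acd, ADD1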
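The wildcard and regular expression examples above can be reproduced outside the database with a short Python sketch. This is only an approximation under stated assumptions: how GRSDB evaluates LIKE and REGEXP internally is not described on this page, the helper names `like_to_regex`, `search_like` and `search_regexp` are hypothetical, and `GENE_NAMES` is a small sample built from the gene names quoted in the examples, not the full database. The character class is written without a comma, since a comma inside brackets would be matched literally.

```python
import re

# Sample gene names taken from the examples above (not the full database).
GENE_NAMES = ["CTSH", "CTSL", "Ctsm", "ACCN1", "ACCN3", "Acd",
              "Ache", "ACOX1", "ACP1", "ACPT", "ADD1"]


def like_to_regex(pattern: str) -> re.Pattern:
    """Translate a LIKE pattern into a case-insensitive regular expression."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")           # % matches 0 or more characters
        elif ch == "_":
            parts.append(".")            # _ matches exactly one character
        else:
            parts.append(re.escape(ch))  # everything else is literal
    return re.compile("".join(parts), re.IGNORECASE | re.DOTALL)


def search_like(pattern: str, names: list) -> list:
    """Return the names that match the whole LIKE pattern."""
    rx = like_to_regex(pattern)
    return [name for name in names if rx.fullmatch(name)]


def search_regexp(pattern: str, names: list) -> list:
    """Return the names in which the regular expression matches somewhere."""
    rx = re.compile(pattern, re.IGNORECASE)
    return [name for name in names if rx.search(name)]


print(search_like("CTS_", GENE_NAMES))         # ['CTSH', 'CTSL', 'Ctsm']
print(search_like("AC%", GENE_NAMES))          # the seven names starting with AC
print(search_regexp("^A[CD]{2}", GENE_NAMES))  # ['ACCN1', 'ACCN3', 'Acd', 'ADD1']
```

Note that a LIKE pattern must match the whole gene name, whereas a regular expression may match anywhere in the name unless it is anchored with ^ or $.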


Boolean Connectors

More complicated searches can be done by using the Boolean connectors AND, OR, XOR, AND NOT, OR NOT, XOR NOT. The use of Booleans is explained in the following table.

Boolean

Comments

AND

Used to find genes simultaneously satisfying two conditions.

Example: Find all genes that are alternatively spliced AND alternatively polyadenylated.

OR

Used to find genes that satisfy at least one of two given conditions.

Example: To find all genes that are alternatively processed use the query: “alternatively spliced OR alternatively polyadenylated.”

XOR

Used to find genes that satisfy exactly one of two given conditions.

Example: The query “alternatively spliced XOR alternatively polyadenylated” will find all genes having one of these conditions, but not both.

AND NOT

Used to find genes having one condition but not the other.

Example: Find all genes that are alternatively spliced AND NOT alternatively polyadenylated.

OR NOT

Equivalent to the logical implication "the second condition implies the first", so it can be used to find all genes except those for which the first condition is false and the second is true.

Example: The query “alternatively spliced OR NOT alternatively polyadenylated” will find all genes except for those which are not alternatively spliced but are alternatively polyadenylated.

XOR NOT

Used to find genes for which both conditions are true or both conditions are false.

Example: The query “alternatively spliced XOR NOT alternatively polyadenylated” will find all genes which either have both properties or neither.
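The six connectors can be compared side by side with a small truth table. The Python sketch below is illustrative only; the `CONNECTORS` dictionary and the `truth_table` function are hypothetical names, with the first argument standing for "alternatively spliced" and the second for "alternatively polyadenylated".

```python
from itertools import product

# Each connector combines the truth values of two conditions a and b.
CONNECTORS = {
    "AND":     lambda a, b: a and b,
    "OR":      lambda a, b: a or b,
    "XOR":     lambda a, b: a != b,
    "AND NOT": lambda a, b: a and not b,
    "OR NOT":  lambda a, b: a or not b,
    "XOR NOT": lambda a, b: a == b,   # a XOR (NOT b) is true when a equals b
}


def truth_table(name: str) -> list:
    """Return (a, b, result) rows for the named connector."""
    op = CONNECTORS[name]
    return [(a, b, op(a, b)) for a, b in product([True, False], repeat=2)]


for name in CONNECTORS:
    print(name)
    for a, b, result in truth_table(name):
        print(f"  spliced={a!s:5} polyadenylated={b!s:5} -> {result}")
```

The OR NOT rows show the behavior described above: the only excluded combination is the one where the first condition is false and the second is true.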


Query Results

The genes that meet the search criteria entered by the user are displayed in a table listing the GRSDB ID, Gene Name, Accession Number, whether the gene is alternatively spliced, and whether the gene is alternatively polyadenylated. The results may be sorted by any of these fields (for the Boolean fields, the genes having that condition are listed first).
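The sort order described above, with genes that have a condition listed first, can be sketched in Python as follows. The rows, the field names and the `sort_results` function are hypothetical placeholders chosen for illustration; they do not reflect actual GRSDB records or its implementation.

```python
# Hypothetical result rows; gene names are placeholders, not real entries.
EXAMPLE_ROWS = [
    {"gene_name": "GENE_B", "alt_spliced": False},
    {"gene_name": "GENE_A", "alt_spliced": True},
    {"gene_name": "GENE_C", "alt_spliced": False},
]


def sort_results(rows: list, field: str) -> list:
    """Sort rows by a field; for Boolean fields, True values come first."""
    def key(row):
        value = row[field]
        # False sorts before True, so negate Booleans to list True first.
        return (not value) if isinstance(value, bool) else value
    return sorted(rows, key=key)


for row in sort_results(EXAMPLE_ROWS, "alt_spliced"):
    print(row["gene_name"], row["alt_spliced"])
```

Because Python's sort is stable, rows that share the same Boolean value keep their original relative order.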

Here is a screen shot of a particular query (to find all genes whose name begins with an A followed by a C or D and that have more than 2 products).


This query creates the following results page. There are 9 genes satisfying the above query.



Gene View

The user may click on any of the genes in the result table. This will lead to a page headed by a table with basic information about that gene. In addition, a table for each alternatively spliced RNA product of the gene is also displayed. The product table lists the product number and name (if known), the number of introns/exons and PolyA signals found in that product, the number of QGRS in the product and the number of QGRS found near RNA processing sites (generally within 120 nucleotides of such a site).
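The count of QGRS near RNA processing sites can be pictured with a short Python sketch. It assumes that "near" means the QGRS start position lies within 120 nucleotides of a site on either side; how GRSDB actually measures the distance is not specified here. The function name `count_near_sites` and the positions used below are hypothetical.

```python
def count_near_sites(qgrs_positions, site_positions, window=120):
    """Count QGRS whose position is within `window` nt of any processing site."""
    return sum(
        1
        for q in qgrs_positions
        if any(abs(q - s) <= window for s in site_positions)
    )


# Hypothetical positions, for illustration only.
qgrs = [100, 500, 3097]
sites = [200, 3000]
print(count_near_sites(qgrs, sites))  # 2
```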

Beneath each product table are two buttons that allow the user to map the QGRS in either a data view or graphic view, as described below. In addition, it is also possible to analyze all products at once. Here is the gene view for the particular gene ACPT from the above example.



Data View

The data view for a particular product lists all non-overlapping QGRS found in that product. For each QGRS it shows the exon or intron in which the sequence is found, the start position of the sequence within the gene, the distance from the nearest 3' or 5' splice site (given in the notation 3:xxx or 5:xxx, where xxx is the distance), the sequence itself, and the G-score associated with the sequence (discussed in the glossary section). In addition, the user may display all QGRS for the product (both overlapping and non-overlapping).
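For readers who copy values out of the data view, the splice-site distance notation can be split into its two parts with a few lines of Python. This is a minimal sketch; the `SpliceDistance` type and the `parse_splice_distance` function are hypothetical names, and the sketch assumes the distance is written as a plain non-negative integer.

```python
from typing import NamedTuple


class SpliceDistance(NamedTuple):
    site: str       # "3'" or "5'" splice site
    distance: int   # distance to that site, as shown in the data view


def parse_splice_distance(text: str) -> SpliceDistance:
    """Parse the 3:xxx or 5:xxx notation into a site and a distance."""
    end, sep, dist = text.strip().partition(":")
    if sep != ":" or end not in ("3", "5") or not dist.isdecimal():
        raise ValueError(f"expected 3:xxx or 5:xxx, got {text!r}")
    return SpliceDistance(site=end + "'", distance=int(dist))


print(parse_splice_distance("3:45"))
```

Here a value such as 3:45 is read as a distance of 45 from the nearest 3' splice site.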

Here is a screen shot for the data view of product 1 of the ACPT gene.



Graphic View

This gives a visual display of the product. Shown below is a screen shot of the graphic view for alternatively spliced RNA product 1 of ACPT.

The top part of the graphic shows the location of exons/introns in the product, along with a scale to locate their positions. The QGRS in the product are indicated by the vertical bars, whose length is proportional to the G-scores of the QGRS. The bottom graphic represents a zoom-in of the RNA product 1 displaying the QGRS at position 3097. The arrows at the bottom left are used to navigate the RNA product with the interactive-zoom tool.

Note: It is possible to display all RNA products of a gene simultaneously in the “graphic view” by clicking on the appropriate button at the bottom of the Gene View page. This is a very useful feature for visual comparisons of all the products. For example, it can be used to identify differential association of QGRS with alternative sites.
