Frequently Asked Questions

NOTE: Many of the entries in the FAQ below apply to the GeneAtlas Version 1 Dataset (PNAS 2002). Some have been updated to reflect our GeneAtlas Version 2 Dataset (PNAS 2004); where an entry does not specify a version, assume it has not yet been updated.

1. What do the 'Median' and 'Med*X' notations mean (GeneAtlas)? What do the black, blue, and red lines denote (SymAtlas)?
2. How many samples are there per tissue/organ? How are the error bars determined?
3. Can I view the absolute call (absence/presence) from Affymetrix?
4. What does "Average Difference Value" mean? Why are there negative values (in the primary data file)?
5. How do I cite the gene expression database?
6. Is the primary data available for download?
7. How is the data transformed? How are negative values treated?
8. Why do I sometimes see a "No Image Available" note?
9. Why do I sometimes see a broken image icon (white box with red X)?
10. Should I use a web spider to download all of the data and/or images?
11. Why doesn't the Affymetrix annotation agree with the [Locuslink/Unigene/SwissProt/etc.] annotation?
12. Where can I download the annotation files?
13. Why do some images only show expression data for 10 tissues?
14. How do you generate your images?
15. How do you scale/normalize your data?
16. Where can I find the probe sequences for a probe set? What do the _g, _x, or _s flags mean?
17. Why do I get "Warning: Page has Expired" errors when I try to use the forward and back buttons?
18. Can I use deep links into GeneAtlas from my application?
19. Can I obtain a commercial license to access/view GeneAtlas? Are there access restrictions to the GeneAtlas web site?
20. What are future plans as far as new features and data sets?
21. Why don't I see a submit button? What browser should I use?
22. Have you deposited your data into any of the public repositories?
23. When I search for GeneX, why are there multiple genes returned? Why might they show a different expression profile?
24. What are the origins of the tissues you profiled?


1. What do the 'Median' and 'Med*X' notations mean (GeneAtlas)? What do the black, blue, and red lines denote (SymAtlas)?
In GeneAtlas (http://expression.gnf.org), the median value is calculated for each gene across all tissues. Med*3 and Med*10 are simply multiples of the median. In SymAtlas, the black, blue, and red lines correspond to the median, 3*median, and 10*median, respectively.
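
As a rough illustration of how these reference lines relate to a gene's profile, here is a minimal Python sketch. The function name `reference_lines`, the tissue names, and the expression values are all hypothetical and chosen only for the example; this is not the code used to build the site.

```python
from statistics import median

def reference_lines(expression_by_tissue):
    """Return the median across tissues and its 3x and 10x multiples."""
    med = median(expression_by_tissue.values())
    return {"Median": med, "Med*3": 3 * med, "Med*10": 10 * med}

# Hypothetical expression values for one probe set across five tissues.
profile = {
    "liver": 120.0,
    "heart": 80.0,
    "brain": 400.0,
    "testis": 100.0,
    "kidney": 60.0,
}

lines = reference_lines(profile)
for label, value in lines.items():
    print(f"{label}: {value}")
```

In this made-up profile, only the brain value (400) rises above the Med*3 line (300), and nothing reaches Med*10.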


2. How many samples are there per tissue/organ? How are the error bars determined?
This information is available under the sample description links. For most samples, duplicate samples or duplicate hybridizations were performed. In some cases, more than two experiments are averaged, and in other cases only one chip experiment is shown (no error bars).


3. Can I view the absolute call (indicating absence/presence) from Affymetrix?
We currently do not provide this information in any of our precomputed displays. However, we plan to add these calls to the downloadable primary data files in the near future.


4. What does "Average Difference Value" mean? Why are there negative values (in the primary data file)?
These are average difference values computed by Affymetrix software. These values are proportional to mRNA content in the sample. Negative values are an artifact of Affymetrix's MAS4 probe condensation algorithm. Essentially, it means the estimate of cross hybridization exceeds the specific signal intensity. For more details on this, visit Affymetrix's web site. We treat negative values as noise and clip them at 20.


5. How do I cite the gene expression database?
The manuscript describing this resource has been published in PNAS. Please cite this reference.


6. Is the primary data available for download?
Yes! For our new "Version 2" GeneAtlas (published PNAS 2004), please see this link. For the original GeneAtlas data (published PNAS 2002), you can download the human data and mouse data here. (The MAS5 version of the human data, as shown in SymAtlas, is here.)


7. How is the data transformed? How are negative values treated?
All values that are less than twenty are clipped (set to twenty).
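
A minimal Python sketch of this clipping step follows. The threshold of 20 comes from the answer above; the function name `clip_low` and the sample values are assumptions made for illustration.

```python
FLOOR = 20  # clipping threshold stated in this FAQ

def clip_low(values, floor=FLOOR):
    """Set every value below `floor` (including negatives) to `floor`."""
    return [max(v, floor) for v in values]

# Hypothetical average difference values, including a negative one.
raw = [-35.2, 0.0, 19.9, 20.0, 512.4]
clipped = clip_low(raw)
print(clipped)
```

Values at or above the threshold pass through unchanged, so only the low, noise-dominated end of the distribution is affected.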


8. Why do I sometimes see a "No Image Available" note?
There are two possibilities. For the mouse MGU74A array, roughly 25% of probe sets were retired by Affymetrix because they were incorrectly designed. For the human U95 array set, only the U95A array was profiled out of the five-array U95A-E set. In both cases, we're including the annotation information of all probes, even if the tissue distribution of gene expression has not been determined.


9. Why do I sometimes see a broken image icon (white box with red X)?
There is a bug in Internet Explorer that prevents it from displaying a very small percentage of images (reason unknown). Netscape Navigator does not appear to have this problem. If you encounter this, please email us the probe set id. Known offenders: Mouse (101055_at)


10. Should I use a web spider to download all of the data and/or images?
No! No! No! If you would like all of the PNG image files, let us know and we will send them to you via ftp! No sense in clogging up network traffic at both of our sites. And we're nice people, we promise. We reserve the right to ban IP addresses of offenders if this becomes a problem...


11. Why doesn't the Affymetrix annotation agree with the [Locuslink/Unigene/SwissProt/etc.] annotation?
The Affymetrix annotation line is generated at the time of chip design, and sequence annotation changes quite frequently. The annotation links we've derived to the public databases are periodically regenerated using BLAST and NCBI linking tables.


12. Where can I download the annotation files?
The annotation files can be downloaded here: human, mouse. Alternatively, use Affymetrix's NetAffx site.


13. Why do some images only show expression data for 10 tissues?
The U74 set of Affymetrix mouse arrays comes in a set of three: U74A, which contains mostly characterized genes, and U74B-C, which contain mostly EST clusters. We ran the full panel of ~45 tissues on U74A, and only a sampling of ~10 tissues on the other two chips. Since we continually reannotate the probe sets based on the latest sequence data, some of those previously uncharacterized EST clusters now map to known genes.


14. How do you generate your images?
The images are pre-generated (not dynamically generated) using Perl, GD (image generation libraries), and GD.pm (the Perl wrappers to the libraries). GD.pm has a bar chart object that makes it easy to generate images of the type shown in the GeneAtlas web site. Error bars are not in the GD.pm package, so we wrote a minor extension (kludge) of the package to allow for that.


15. How do you scale/normalize your data?
We used the standard scaling algorithm from the Affymetrix software. In short, it takes all intensity values from a chip image, removes the top and bottom 2% of probes, and scales all values such that the average of the remaining probes equals some user-defined number (200, for our dataset). The files that you can download from our web site have already been scaled.
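
The scaling described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the Affymetrix implementation: the function name `scale_chip`, the way the 2% trim count is rounded, and the sample intensities are assumptions made for the example.

```python
def scale_chip(intensities, target=200.0, trim_fraction=0.02):
    """Scale one chip so that its trimmed mean equals `target`.

    The top and bottom `trim_fraction` of values are ignored when the
    mean is computed, but every value is multiplied by the same factor.
    """
    n = len(intensities)
    k = int(n * trim_fraction)      # values dropped at each end (assumed rounding)
    kept = sorted(intensities)[k:n - k]
    trimmed_mean = sum(kept) / len(kept)
    factor = target / trimmed_mean
    return [v * factor for v in intensities]

# Hypothetical chip with 100 intensity values: 1.0, 2.0, ..., 100.0
chip = [float(v) for v in range(1, 101)]
scaled = scale_chip(chip)

# The trimmed mean of the scaled values now sits at the target.
check = sorted(scaled)[2:98]
print(round(sum(check) / len(check), 6))
```

Because a single multiplicative factor is applied per chip, the relative ordering of values within a chip is preserved; only the overall brightness is brought to a common level across chips.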


16. Where can I find the probe sequences for a probe set? What do the _g, _x, or _s flags mean?
To look up details of a particular probe (including probe sequences), use Affy's NetAffx query tool. You might need to register and log in for this feature. If you just want a short explanation of the _s and _x notation, I don't think this link requires a login.


17. Why do I get "Warning: Page has Expired" errors when I try to use the forward and back buttons?
This is a current technical limitation. The trade-off for making this go away is having a theoretical limit on the number of probe sets that could be displayed, so I opted for functionality over convenience. But realizing how annoying it is, I poked around and found that newer/smarter browsers like Mozilla don't have this problem. And while I'm on the subject, after using Mozilla for a couple of weeks, I have nothing but good things to say... Try it today...


18. Can I use deep links into GeneAtlas from my application?
Absolutely. For the appropriate deep-linking syntax, contact us.


19. Can I obtain a commercial license to access/view GeneAtlas? Are there access restrictions to the GeneAtlas web site?
Sorry, commercial access is not available due to our own licensing restrictions that we must abide by. For additional details, please see the Terms of Use.


20. What are future plans as far as new features and data sets?
We can't make any specific promises about timelines or features, but rest assured that we are continually planning improvements and additions. If you have specific ideas and/or requests, email us.


21. Why don't I see a submit button? What browser should I use?
Difficulties with Internet Explorer 5.x on Mac have been reported. Please try a different browser/platform combination, and if the problem persists, please email us.


22. Have you deposited your data into any of the public repositories?
Yes, we have deposited these data into NCBI's GEO.


23. When I search for GeneX, why are there multiple genes returned? Why might they show a different expression profile?
The GeneAtlas uses commercially available gene expression arrays. In some cases, Affymetrix designed multiple probe sets to one gene at the time of chip design (for example, to interrogate splice variants, or because a single ideal probe set couldn't be found). In other cases, Unigene clusters at design time have been subsequently merged in our re-annotation process (see Question #11). Differences in expression pattern between multiple probe sets that are annotated as the same gene can be due to a variety of technical and scientific reasons (including poorly designed probe sets, cross hybridization, etc.).


24. What are the origins of the tissues you profiled?
This information is contained in our sample annotation page for human and mouse.


Last modified: June 9, 2003
Additional questions? Email us.