Rice GeneCloud/Keywords - semantic terms overrepresented in the descriptions of a gene list

This script uses a semantic technique to scan the descriptions of the genes in a list that you provide. It returns, in a ‘word cloud’ format, only the semantic terms that are significantly more represented in the gene list than in the whole-genome background. It tends to be very powerful and gives results that are similar yet complementary to GO term enrichment analysis. Unlike GO analysis, it does not rely on categories: each word in a description is handled on its own. If a word occurs more often than expected at random, it is kept and displayed. For instance, if your gene list is enriched in the terms "myb", "hairs", or "endomembrane", they will be displayed.
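
As an illustration of the idea, here is a minimal Python sketch of per-word enrichment. It is not the GeneCloud R script. The statistical test (a one-sided hypergeometric test), the 0.05 threshold, the tokenization rule, and all function names are assumptions made for this example only.

```python
import re
from math import comb


def tokenize(description):
    """Lowercase a description and return its set of distinct words."""
    return set(re.findall(r"[a-z0-9]+", description.lower()))


def hypergeom_tail(k, K, n, N):
    """P(X >= k) when n genes are drawn from N, of which K contain the word."""
    upper = min(K, n)
    total = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return total / comb(N, n)


def enriched_words(gene_list, genome, alpha=0.05):
    """Return {word: (occurrence, enrichment_ratio, p_value)}.

    genome    : dict mapping gene id -> description (whole-genome background)
    gene_list : iterable of gene ids supplied by the user
    alpha     : significance threshold (assumed value, not from GeneCloud)
    """
    selected = [g for g in gene_list if g in genome]
    N, n = len(genome), len(selected)

    # Number of genes whose description contains each word, genome-wide.
    background = {}
    for description in genome.values():
        for word in tokenize(description):
            background[word] = background.get(word, 0) + 1

    # Same count, restricted to the user's gene list.
    counts = {}
    for gene in selected:
        for word in tokenize(genome[gene]):
            counts[word] = counts.get(word, 0) + 1

    results = {}
    for word, k in counts.items():
        K = background[word]
        ratio = (k / n) / (K / N)          # frequency in list / frequency in genome
        p_value = hypergeom_tail(k, K, n, N)
        if ratio > 1 and p_value < alpha:  # keep only overrepresented words
            results[word] = (k, ratio, p_value)
    return results
```

Each word is tested independently of any category, which mirrors the description above. A real analysis would probably also correct for multiple testing, which this sketch omits.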

Queries can be single words or sentences with Boolean operators:
AND: Finds genes that contain in their description terms on both sides of the operator (the intersection of both searches).
OR: Finds genes that contain in their description either term (the union of both searches).
NOT: Finds genes that contain in their description the term on the left but not the term on the right-hand side of the operator.
Sentences in quotation marks.
All Boolean operators are processed in a left-to-right sequence.
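
The query rules above can be sketched in Python as follows. This is a hypothetical illustration, not the server's code: the function names, the case-insensitive substring matching, and the use of `shlex` to keep quoted sentences together are all assumptions.

```python
import shlex


def search_term(term, genome):
    """Ids of genes whose description contains the term (case-insensitive)."""
    needle = term.lower()
    return {gene for gene, desc in genome.items() if needle in desc.lower()}


def run_query(query, genome):
    """Evaluate 'term OP term OP term ...' strictly from left to right.

    genome : dict mapping gene id -> description
    query  : terms separated by AND, OR or NOT; a sentence in double
             quotes is treated as a single term.
    """
    tokens = shlex.split(query)  # "protein kinase" stays one token
    if not tokens:
        return set()
    if len(tokens) % 2 == 0:
        raise ValueError("query must alternate terms and operators")

    result = search_term(tokens[0], genome)
    for i in range(1, len(tokens), 2):
        operator, term = tokens[i], tokens[i + 1]
        hits = search_term(term, genome)
        if operator == "AND":
            result &= hits   # intersection of both searches
        elif operator == "OR":
            result |= hits   # union of both searches
        elif operator == "NOT":
            result -= hits   # left-hand result minus right-hand hits
        else:
            raise ValueError("unknown operator: " + operator)
    return result
```

Because there is no operator precedence, `myb OR nitrate NOT kinase` is read as `(myb OR nitrate) NOT kinase` in this sketch.
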
Modification of the code (May 20th 2015):
Following user feedback, the size of the words displayed in the cloud is now proportional to "occurrence x ratio of enrichment".
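
A minimal Python sketch of this sizing rule, assuming the largest weight is mapped to a maximum font size. The function name, the `max_size` parameter, and its default value are hypothetical.

```python
def word_sizes(stats, max_size=60.0):
    """Map each word to a display size proportional to occurrence x ratio.

    stats    : dict mapping word -> (occurrence, enrichment_ratio)
    max_size : size given to the word with the largest weight (assumed)
    """
    weights = {word: occ * ratio for word, (occ, ratio) in stats.items()}
    if not weights:
        return {}
    top = max(weights.values())
    return {word: max_size * weight / top for word, weight in weights.items()}
```

With this rule, a rare but strongly enriched word can be displayed as large as a frequent but weakly enriched one.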

If you publish these results please cite:
GeneCloud reveals semantic enrichment in lists of gene descriptions
Gabriel Krouk, Clément Carré*, Cécile Fizames*, Alain Gojon, Sandrine Ruffel & Benoît Lacombe
Mol. Plant 8(6):971-3, Jun 2015
This R script was written by Gabriel Krouk, Clément Carré & Cécile Fizames. Interactive use of this web server is free for academic use.
