Hi David,
I see several problems with this idea. First, the overwrite of human-added text previously mentioned.
Did you also have a problem with the proposed response?
http://forum.citizendium.org/index.php/topic,697.msg5853.html#msg5853
Second, there is nearly no information for most of the (human) genes. Although there are a lot of geneticists, they tend, I think, to mostly work on established genes, paying particular attention to oncogenes, for example. Because of our current lack of knowledge in a systems biology sense, much of the gene ontology is likely to be wrong and will need to be updated, getting back to first point again. Also the problem with multiple reading frames for some genes.
Agreed that many genes are virtually unannotated, which is why I proposed a rough guesstimate of 10k gene pages (as opposed to the full set of ~25k mammalian genes). I disagree, however, that "much of the gene ontology is likely to be wrong". GO annotation is both incomplete and imprecise (i.e., very general annotation, like "cellular metabolism"), but I'd hesitate to say inaccurate. Do you have any specific examples or studies to support this? Regardless, even incomplete and imprecise annotation will require updates, and I hope the proposal in the previous post addresses those concerns.
Also agreed that most genes have multiple possible protein products, but like most of the gene databases available now (and much of the literature), I propose we start with this gene-level of abstraction.
How about a thought experiment: make a bot to extract every word out of a good dictionary for each language on earth, make stubs for all of them and see how many people volunteer to upgrade all of the stubs and inter-relate them?
If someone had proposed a similar thought experiment five years ago, replacing "every word out of a good dictionary" with "every entry out of a good encyclopedia", I might have been skeptical too. But I am continually surprised and impressed by people's desire to share knowledge. And molecular biology in particular, I think, is ready for this idea. Of course, reasonable people can disagree on that, which is why we're very open to suggestions on how to improve the idea. But personally, I'm in too deep, so not doing it is not an option (so long as it doesn't harm the larger WP/CZ efforts).
I did like the example gene page, but only because a lot of information is available for the particular gene that was selected, including a 3D structure of the protein. The final point is what kind of label will you put on an open reading frame with no known function and no name other than ORF #####? How does that help the layman?
Agreed, let's start with genes (~10k?) that have at least a framework to hang free-text annotation off of, and the richness of these stubs will vary widely. In the upcoming pilot experiment for ~10 genes, we'll get a range in degree of annotation.
(And of course, I agree with David Goodman's thoughts on using the least restrictive license possible.)
Cheers,
-andrew