Citizendium Forums
Author Topic: Gene articles and bots  (Read 20217 times)
Andrew Su
Forum Member
**
Posts: 11


« on: March 26, 2007, 18:18:37 UTC »

All,

I recently approached the folks in the Molecular and Cellular Biology project at Wikipedia about a proposal to create, in an automated way, stubs for ~10,000 mammalian genes in parallel.  I posted a summary of the proposal on my wikipedia user page with links to some of the primary discussions:

http://en.wikipedia.org/wiki/User:AndrewGNF
http://en.wikipedia.org/wiki/User:ProteinBoxBot (updated link 4/3/2007)

In short, I proposed organizing structured data (synonyms and aliases, genome locations, gene function, etc.) from many public databases, and creating stub pages with infoboxes summarizing these data.  (These stubs ideally will serve as seeds for people to contribute the more non-structured data for which wikis are a great tool.)  In a parallel non-wiki project, we have done the data integration effort so now we are exploring how to create gene stubs that are most useful for the Wikipedia community.
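The stub-generation step described above can be sketched as a small function that renders one collated gene record into infobox wikitext. This is a rough illustration only: the template name `GNF_Protein_box` and the field names are assumptions for the example, not the bot's actual schema.

```python
# Hypothetical sketch: turn one collated gene record into a wikitext stub.
# Template name and field names are illustrative assumptions.

def gene_stub_wikitext(gene):
    """Render a minimal stub: an infobox of structured fields plus a seed sentence."""
    infobox_lines = ["{{GNF_Protein_box"]
    for field in ("symbol", "name", "aliases", "chromosome", "function"):
        infobox_lines.append("| %s = %s" % (field, gene.get(field, "")))
    infobox_lines.append("}}")
    # The free-text seed is the part human editors are expected to grow.
    seed = "'''%s''' (%s) is a mammalian gene." % (gene["name"], gene["symbol"])
    return "\n".join(infobox_lines) + "\n\n" + seed

example = {"symbol": "ITK", "name": "IL2-inducible T-cell kinase",
           "aliases": "EMT, LYK", "chromosome": "5", "function": "kinase activity"}
print(gene_stub_wikitext(example))
```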

The work-in-progress gene stub example is here: http://en.wikipedia.org/w/index.php?title=IL2-inducible_T-cell_kinase

The opening of the citizendium project presents the possibility of doing this effort here in addition to or instead of at Wikipedia.  As I see it, this possibility rests on two initial questions -- would this be desirable to the CZ community, and is there a bot policy to facilitate this?

Comments/suggestions are welcome...

Cheers,
-andrew
« Last Edit: April 04, 2007, 01:54:55 UTC by Andrew Su » Logged

Chris Day
Forum Regular
*****
Posts: 1068



« Reply #1 on: March 26, 2007, 18:49:38 UTC »


Hey Andrew, I have been watching your discussions with Tim and co, and I think your idea is excellent. At present there are no automated bots running on Citizendium that I know of, but one would be required for this proposal.  We have also considered using the public databases as a starting point for establishing the tree of life related articles (mooted in a separate thread). I'll go back and re-read your proposals to get myself up to speed.  Note that the copyright licenses are slightly different here.
Logged

Zachary Pruckowski
Forum Communicator
****
Posts: 933


« Reply #2 on: March 26, 2007, 19:14:25 UTC »

Quote
At present there are no automated bots running on citizendium, that i know of, but this would be required for the proposal.

Correct.  There are currently no bots* on CZ.  Software to do what you're describing exists, and can be ported from Wikipedia versions.  If you decide you want to do this, we'll have to set it up and run it from CZ's servers, simply because (no offense) it could otherwise be a security risk.  I think that we'd prefer access at the wiki level versus access at the DB level, which should be fine for your purposes.

* = Jason (our technical lead) hates the word "bot", so we'd probably call it something else.  Bot has a very negative connotation in the IT field.
Logged

Andrew Su
Forum Member
**
Posts: 11


« Reply #3 on: March 26, 2007, 23:28:59 UTC »

Quote
Notice that the copyright licenses are slightly different here.

Hmmm, I didn't notice that originally, but it prompted me to do a little reading.  Is it correct to say that there is no consensus yet on the exact license, including use by commercial institutions?  Although our intent on this project is to be as open and "academic" as possible, GNF itself is not a non-profit.  If we would be prohibited from incorporating CZ content into our gene portal, then this is pretty much a non-starter...  Anyway, if there is a specific license agreement that I just haven't found, please point me in the right direction...

Quote
we'll have to set it up and run it from CZ's servers, simply because (no offense) it could otherwise be a security risk. 

Not sure how it'd be a security risk, since the b*t would essentially be screen scraping and inheriting the permissions of its user account.  But anyway, no objection in principle to running off of CZ's servers.  And of course, we're happy to refer to this hypothetical b*t by whatever name is preferred... Wink

Cheers,
-andrew
Logged

Zachary Pruckowski
Forum Communicator
****
Posts: 933


« Reply #4 on: March 27, 2007, 00:20:44 UTC »

Quote
we'll have to set it up and run it from CZ's servers, simply because (no offense) it could otherwise be a security risk. 

Not sure how it'd be a security risk, since the b*t would essentially be screen scraping and inheriting the permissions of its user account.  But anyway, no objection in principle for running off of CZ's servers.  And of course, we're happy to refer to this hypothetical b*t by whatever name is perferred... Wink

I may have misunderstood you.  I thought you meant that the bot would also be creating articles.  If it just wants to read, then it can live wherever it wants.  If it wants write access however, then security considerations come into play (not that we don't trust you, just that we need to keep tabs on any sort of automated editing).
Logged

Andrew Su
Forum Member
**
Posts: 11


« Reply #5 on: March 27, 2007, 00:36:54 UTC »

Sorry, I think I've muddied the water here.  Yes, the bot would be creating and editing articles.  By "screen scraping" I meant that it would be editing via CGI get and post within the context of a bot user account (as opposed to any sort of API or DB-level access).  And I completely understand the rationale of tracking and regulating bots in a similar way to WP -- absolutely no objection here.
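
The "screen scraping" approach Andrew describes (form-based CGI edits under a bot account, rather than API or database access) might look roughly like this. The endpoint and form-field names are assumptions modeled on MediaWiki's edit form, not a confirmed CZ interface:

```python
# Sketch of form-based bot editing: assemble the same request a browser's
# "Save page" would send. Field names (wpTextbox1, wpEditToken, wpSummary)
# are assumptions based on MediaWiki's edit form of the era.

from urllib.parse import urlencode

def build_edit_request(base_url, title, new_text, edit_token):
    """Assemble the URL and POST body for a form-based page save."""
    url = "%s/index.php?%s" % (base_url, urlencode({"title": title, "action": "submit"}))
    body = urlencode({
        "wpTextbox1": new_text,     # the full replacement wikitext
        "wpEditToken": edit_token,  # anti-CSRF token scraped from the edit form
        "wpSummary": "ProteinBoxBot: automated infobox update",
    })
    return url, body

url, body = build_edit_request("http://en.citizendium.org", "ITK", "{{infobox}}", "abc123")
```

In this scheme the bot inherits exactly the permissions of its user account, which is Andrew's point: nothing beyond ordinary wiki-level access is needed.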

Anyway, this is all jumping the gun.  Before the technical/regulatory issues, I think first we need to determine if this proposal is desirable within the scope of the CZ Biology project.  And for that, I'm excited to hear feedback from Chris and the other Biology authors and editors...
Logged

Chris Day
Forum Regular
*****
Posts: 1068



« Reply #6 on: March 31, 2007, 16:50:20 UTC »

Quote
And for that, I'm excited to hear feedback from Chris and the other Biology authors and editors...

Hi Andrew, so I was looking at your templates on Wikipedia and they look excellent. Connecting with the gene ontology keywords is a good idea. It is a great starting point for unifying the information on different genes. The best thing is that the updating is then not dependent on CZ or WP but on the respective host sites. This means everything is kept as up to date as possible.

Is your long term goal to stick with mice and humans? Or is there a way to tie in the mess of gene nomenclature for all species?
« Last Edit: March 31, 2007, 16:53:52 UTC by Chris Day » Logged

Andrew Su
Forum Member
**
Posts: 11


« Reply #7 on: April 01, 2007, 21:54:05 UTC »

Quote
Is your long term goal to stick with mice and humans? Or is there a way to tie in the mess of gene nomenclature for all species?

Our institute's focus (and my personal interest) is on mammalian biology, and the database that we've developed to collate all gene annotation from the public domain is focused on human and mouse.  So yes, for the foreseeable future, our emphasis will be on those two organisms.  Technically speaking, I think if someone else were more interested in adding content for another organism, it probably would be pretty straightforward for that person to adapt their data to use the bot that we develop.  Scientifically, however, I think this is a pretty sticky issue that will need further thought.  Between human and mouse, it's relatively easy to assign orthologs (the "same gene" in different species), but as you get to more and more distant organisms one of course has to deal with vast gene family expansion, functional "drift", etc.  Anyway, for this reason (and because we need to start somewhere), we're just going to commit to doing mouse and human right now...
Logged

Chris Day
Forum Regular
*****
Posts: 1068



« Reply #8 on: April 02, 2007, 01:16:41 UTC »

Quote
Anyway, for this reason (and because we need to start somewhere), we're just going to commit to doing mouse and human right now...

I agree that this is the best approach, but I was thinking along the lines of crop plants, later. You are at Novartis, right? Or is that my mistake?
Logged

Andrew Su
Forum Member
**
Posts: 11


« Reply #9 on: April 02, 2007, 17:57:10 UTC »

Quote
Quote
Anyway, for this reason (and because we need to start somewhere), we're just going to commit to doing mouse and human right now...

I agree that this is the best approach but i was thinking along the lines of crop plants, later. You are at Novartis, right? Or is that my mistake?

GNF is a research institute funded by the Novartis Research Foundation and separate from Novartis' internal pharmaceutical research.  To my knowledge, Novartis is no longer in the agricultural business after having merged its agribusiness with AstraZeneca's several years back...   But I'm not sure if integrating with the plant genomics community would be easier or harder than other animal model organisms.  The plant-specific genes are easy -- they just become new gene entries.  The basic cellular machinery which is shared would be sticky for the same reason as before -- how does one describe the true orthologs?
Logged

Chris Day
Forum Regular
*****
Posts: 1068



« Reply #10 on: April 03, 2007, 14:56:58 UTC »

Quote
how does one describe the true orthologs?

Right, and it's worse in plants due to the frequent polyploidy events.
Logged

Larry Sanger
Founding Editor-in-Chief
Forum Regular
*****
Posts: 1830



WWW
« Reply #11 on: April 04, 2007, 02:25:13 UTC »

I see no good reason not to do this, except that the license might be problematic, from the sounds of it.  For a good while I've been leaning toward CC-by-sa-nc, but I came from a position where we'd use the GFDL for everything.  I'm now leaning back toward the latter position.  The advantages and disadvantages are all hard to weigh all at once, but I'm confident we'll make a wise decision.  Bear in mind that, even if we did decide to go with CC-by-sa-nc, you could export just the articles your b*t Smiley creates, because you'd be sharing the copyright with CZ.

I think that a project plan needs to be worked out and examined in some detail, however.  Is it necessary to make the articles editable at all?  Is the idea that a bot would create short, standardized articles based on shared data, which human beings would then add to?  Isn't this a bit problematic in that the bot would be able to run only once, since a second edition would automatically overwrite whatever human beings added?  What's the plan to deal with this problem, anyway?

Furthermore, there is the problem whether there are enough people available, even in the long run, to transform your bot-generated "stubs" into "encyclopedia articles."  Are there so many geneticists in the world that we can expect them to have interesting things to say about 10K genes on CZ?  If not, perhaps we shouldn't make any of the pages created by the bots editable at all (i.e., protect them all).  If someone wanted to write about a particular important gene, he would have to make a separate article.  And in that case, the bot-generated gene reference work could live in a separate namespace (think [[Gene:IL2-inducible_T-cell_kinase]]).

There's also one disadvantage of the plan, which is that "random page" would become almost useless--after your bot ran, most of the articles in the database would be gene articles.  But this is also just a temporary inconvenience.  If it adds 10K articles, that will dilute the database for only a year or two.  A possibility is that we could exclude the bot articles from "random page" somehow.

Maybe you can write up a project plan, answering these and other obvious questions, that we can examine?
« Last Edit: April 04, 2007, 02:34:05 UTC by Larry Sanger » Logged

My CZ user page: http://en.citizendium.org/wiki/User:Larry_Sanger
Please link to your CZ user page in your signature, too!
To do that, click on Profile > Forum Profile Information.
Andrew Su
Forum Member
**
Posts: 11


« Reply #12 on: April 04, 2007, 05:12:00 UTC »

Quote
Bear in mind that, even if we did decide to go with CC-by-sa-nc, you could export just the articles your b*t Smiley creates, because you'd be sharing the copyright with CZ.

Yeah, but I think we'd only be able to export the part of the article we contributed, and not take advantage of the community's efforts to enhance the article.  Anyway, I'll be very interested to see how this pans out.  If it goes CC-by-sa-nc, then probably I'll try to use the same bot on both CZ and WP, but the emphasis on debugging and customization would be on WP... 

Quote
I think that a project plan needs to be worked out and examined in some detail, however.  Is it necessary to make the articles editable at all?  Is the idea that a bot would create short, standardized articles based on shared data, which human beings would then add to?  Isn't this a bit problematic in that the bot would be able to run only once, since a second edition would automatically overwrite whatever human beings added?  What's the plan to deal with this problem, anyway?

I plan to limit the bot edits to a specific infobox.  The example page (http://en.wikipedia.org/w/index.php?title=IL2-inducible_T-cell_kinase) has evolved quite a bit since I first posted it, and I think it's pretty close to a final v1.0 target.  All the edits that the bot makes will be confined to that infobox, so it won't write over contributions to the unstructured free-text section.  I plan on putting a comment in the infobox that changes will be overwritten by bot updates, and also posting instructions on how to add a comment that will tell the bot to skip the update.
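
The update rule outlined in this post (rewrite only the infobox region, and back off entirely when a human has left a "do not update" comment) could look something like the following sketch. The marker string and template name are illustrative assumptions, not the bot's actual conventions:

```python
# Sketch: replace only the infobox, leave free text alone, honor a skip marker.
# SKIP_MARKER and the GNF_Protein_box template name are assumptions.

import re

SKIP_MARKER = "<!-- PBB_NO_UPDATE -->"
INFOBOX_RE = re.compile(r"\{\{GNF_Protein_box.*?\n\}\}", re.DOTALL)

def update_infobox(page_text, new_infobox):
    """Return page_text with its infobox refreshed; unchanged if opted out."""
    if SKIP_MARKER in page_text:
        return page_text  # a human opted this page out of bot updates
    return INFOBOX_RE.sub(new_infobox, page_text, count=1)

page = "{{GNF_Protein_box\n| symbol = ITK\n}}\n\nHand-written prose stays."
print(update_infobox(page, "{{GNF_Protein_box\n| symbol = ITK\n| name = ITK kinase\n}}"))
```

Because the substitution is anchored on the template delimiters, repeated bot runs refresh the structured data without touching the human-written text below it.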

Quote
Furthermore, there is the problem whether there are enough people available, even in the long run, to transform your bot-generated "stubs" into "encyclopedia articles."  Are there so many geneticists in the world that we can expect them to have interesting things to say about 10K genes on CZ?  If not, perhaps we shouldn't make any of the pages created by the bots editable at all (i.e., protect them all).  If someone wanted to write about a particular important gene, he would have to make a separate article.  And in that case, the bot-generated gene reference work could live in a separate namespace (think [[Gene:IL2-inducible_T-cell_kinase]]).

I would strongly advocate not protecting the articles or putting them in a separate namespace.  I believe there's a huge community of geneticists and molecular biologists with whom this would catch on.  First, researchers write peer-reviewed review articles all the time on a particular protein family, but it's a select few who get invited to write these reviews or have the patience to do it all.  Second, I plan on putting reciprocal links between the infobox and the web application for which we collated all the data in the first place (http://symatlas.gnf.org/SymAtlas).  We get ~40K hits and ~3K users per week, so I hope to lead this crowd to the CZ/WP efforts.  Anyway, like most things on CZ/WP, I think this effort will only get bigger with time.  My hope is that this will nucleate some sort of critical mass...

Quote
Maybe you can write up a project plan, answering these and other obvious questions, that we can examine?

I just put together an initial set of specs for the bot at http://en.wikipedia.org/wiki/User:ProteinBoxBot.  This info will eventually be used for the WP bot approval process.  The discussion above noted that there is no official CZ bot policy.  I guess CZ is small enough (now) such that perhaps we wouldn't need one to proceed, as long as it's clear what the game plan is...  Anyway, happy to provide more details or answer more questions. 
Logged

Jason "Electrawn" Potkanski
Forum Participant
***
Posts: 158


I eat vandals like treats.


« Reply #13 on: April 04, 2007, 05:45:28 UTC »

To counterbalance this, the first bot we allow will be one to import the articles from the 1911 Britannica. The article texts are in the public domain and always will be. It should be easy to find the public-domain version that was uploaded to Wikipedia or via other websites. Legal precedent says that a copy of a work that is in the public domain is itself in the public domain -- for example, a picture of a page in the 1911 Britannica. That also makes http://www.1911encyclopedia.org/ fair game, contrary to whatever the policy in their disclaimers and terms attempts to limit.

-Jason Potkanski
Logged
Andrew Su
Forum Member
**
Posts: 11


« Reply #14 on: April 04, 2007, 18:02:48 UTC »

Quote
To counterbalance this, the first bot we allow is to import the articles from 1911 Britannica. The article texts are in the public domain and will always be. It should be easy to find the Public Domain version that was uploaded to wikipedia or via other websites. Legal precedent says that a copy of a work that is in the public domain is in the public domain. Like say a picture of a page in the 1911 Britannica. That also makes http://www.1911encyclopedia.org/ fair game, contrary to whatever the policy in their disclaimers and terms attempts to limit.

Jason, I want to be sure I understand the relevance of the example above.  Do you mean to point out that since my proposed bot is gathering things from the public domain, there will be no restrictions on how GNF (as a commercial entity) can use that data?  If so, then it's a point well taken, but my primary concern is the unstructured "free-text" content that comes in *after* we seed these protein stubs.  For people who first contribute to CZ, under CC-by-sa-nc we would not be able to put that content on our site.  And if that's the case, then there is no incentive (and really a disincentive) for me to steer our SymAtlas community to CZ.  Better to link our site to WP, where we will be able to incorporate their content into our portal (and CZ of course could do the same).  But it would not work in reverse.  For most people this distinction may not be relevant, but for GNF it is...
Logged

David Goodman
Forum Participant
***
Posts: 247


« Reply #15 on: April 19, 2007, 05:07:19 UTC »

All of this is a clear illustration of the problems we will encounter by having the NC restriction. The use of CZ will be increased both in nature and in amount by the least restrictive possible licensing.
Logged

David E. Volk
Non-Citizen
***
Posts: 330


David Volk at Stingaree


WWW
« Reply #16 on: April 22, 2007, 10:15:40 UTC »

I see several problems with this idea.  First, the overwrite of human-added text previously mentioned.  Second, there is nearly no information for most of the (human) genes.  Although there are a lot of geneticists, they tend, I think, to mostly work on established genes, paying particular attention to oncogenes, for example.  Because of our current lack of knowledge in a systems biology sense, much of the gene ontology is likely to be wrong and will need to be updated, getting back to the first point again.  There is also the problem of multiple reading frames for some genes.

How about a thought experiment: make a bot to extract every word out of a good dictionary for each language on earth, make stubs for all of them, and see how many people volunteer to upgrade all of the stubs and inter-relate them?  I did like the example gene page, but only because a lot of information is available for the particular gene that was selected, including a 3D structure of the protein.  The final point is: what kind of label will you put on an open reading frame with no known function and no name other than ORF #####?  How does that help the layman?
Logged

Andrew Su
Forum Member
**
Posts: 11


« Reply #17 on: April 24, 2007, 21:20:37 UTC »

Hi David,

Quote
I see several problems with this idea.  First the overwrite of human-added text previously mentioned.

Did you also have a problem with the proposed response?  http://forum.citizendium.org/index.php/topic,697.msg5853.html#msg5853

Quote
Second, there is nearly no information for most of the (human)  genes.  Although there are a lot of geneticists, they tend, I think, to mostly work on established genes, paying particular attention to oncogenes, for example.  Because of our current lack of knowledge in a systems biology sense, much of the gene ontology is likely to be wrong and will need to be updated, getting back to first point again.  Also the problem with multiple reading frames for some genes.

Agreed that many genes are virtually unannotated, which is why I proposed a rough guesstimate of 10k gene pages (as opposed to the full set of ~25k mammalian genes).  I disagree, however, that "much of the gene ontology is likely to be wrong".  GO annotation is both incomplete and imprecise (i.e., very general annotation, like "cellular metabolism"), but I'd hesitate to say inaccurate.  Do you have any specific examples or studies to support this?  Regardless, even incomplete and imprecise annotation will require updates, and I hope the proposal in the previous post addresses those concerns.

Also agreed that most genes have multiple possible protein products, but like most of the gene databases available now (and much of the literature), I propose we start with this gene-level of abstraction.

Quote
How about a thought experiment: make a bot to extract everyword out of a good dictionary for each language on earth, make stubs for all of them and see how many people volunteer to upgrade all of the stubs and inter-relate them?

If someone had proposed a similar thought experiment five years ago, replacing "every word out of a good dictionary" with "every entry out of a good encyclopedia", I might have been skeptical too.  But I am continually surprised and impressed by people's desire to share knowledge.  And molecular biology in particular, I think, is ready for this idea.  Of course, reasonable people can disagree on that, which is why we're very open to suggestions on how to improve the idea.  But personally, I'm in too deep, so not doing it is not an option Wink (so long as it doesn't harm the larger WP/CZ efforts).

Quote
I did like the example gene page, but only because a lot of information is available for the particular gene that was selected, including a 3D structure of the protein.  The final point is what kind of label will you put on a open reading frame with no known function and no name other than ORF #####?  How does that help the layman?

Agreed, let's start with genes (~10k?) that have at least a framework to hang free-text annotation off of, and the richness of these stubs will vary widely.  In the upcoming pilot experiment for ~10 genes, we'll get a range in degree of annotation. 

(And of course, I agree with David Goodman's thoughts on using the least restrictive license possible.)

Cheers,
-andrew
Logged

David E. Volk
Non-Citizen
***
Posts: 330


David Volk at Stingaree


WWW
« Reply #18 on: April 30, 2007, 21:08:42 UTC »

Hi Andrew,  Grin

I just now came across your reply.  I thought I was set up to receive these sorts of things in my email, but apparently not.  If you or anyone else can help me with that, please do reply.

I am all for more knowledge, so don't let me discourage you.  I am just pointing out possible things to consider.

Did you happen to read about the Macaque genome in Science last week?  One of the interesting things in it is that sometimes "disease" gene variants in humans are actually the "normal" genes in Macaque, and that the Macaque variant is the ancestral gene, more closely resembling everything else.

At present, I have no explicit references for you, nor the desire to look them up, but it often happens (in seminars, symposia) that proteins have multiple names, because different people working on, say, vole foot fungus and human blindness will declare different functions for the same protein (gene product).  These often come out of gene knockout studies.  But since the cell works through a highly concerted, inter-related network of proteins, with redundancies built in for many vital functions, the results of gene knockout studies can be very misleading, though not necessarily so.  The gene ontology question is just something that will work itself out over the next ten years or so.

Good luck with the project.





Logged

Andrew Su
Forum Member
**
Posts: 11


« Reply #19 on: August 13, 2007, 23:57:56 UTC »

Time to resurrect this dormant thread... 

We just finished a trial run of our bot over on Wikipedia.  We took the 33 most-cited genes (according to Entrez Gene and PubMed).  Of these, 8 had no existing WP pages when searching for the gene symbol, name, or aliases.  We created stubs for these genes from data in public-domain databases.

http://en.wikipedia.org/wiki/MMP9    
http://en.wikipedia.org/wiki/HIF1A    
http://en.wikipedia.org/wiki/PTGS2    
http://en.wikipedia.org/wiki/NFKB1    
http://en.wikipedia.org/wiki/TGFB1
http://en.wikipedia.org/wiki/PPARG    
http://en.wikipedia.org/wiki/AKT1    
http://en.wikipedia.org/wiki/MAPK1
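
The triage step described above (create a stub only when no page exists under the gene's symbol, name, or any alias) can be sketched as a simple check against the set of existing titles. The record layout here is an assumption for illustration:

```python
# Sketch of stub triage: a gene gets a new bot-created stub only if no page
# already exists under any of its candidate titles.

def needs_stub(gene, existing_titles):
    """True if no page exists under the gene's symbol, name, or aliases."""
    candidates = [gene["symbol"], gene["name"]] + gene.get("aliases", [])
    return not any(title in existing_titles for title in candidates)

existing = {"Apolipoprotein E", "APOE"}
genes = [{"symbol": "APOE", "name": "Apolipoprotein E", "aliases": []},
         {"symbol": "MMP9", "name": "Matrix metallopeptidase 9", "aliases": ["CLG4B"]}]
to_create = [g["symbol"] for g in genes if needs_stub(g, existing)]
print(to_create)  # only MMP9, which has no page under any candidate title
```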

In addition, for the remaining 25 genes for which WP pages did previously exist, we could certainly update these pages with our new gene infobox in a semi-automated way.  For illustration, I've updated two of them:

http://en.wikipedia.org/wiki/Apolipoprotein_E
http://en.wikipedia.org/wiki/Amyloid_precursor_protein

I expect that we'll get full bot approval at WP in the next couple of weeks, and start generating these stubs and enhancing pages within a month.  Seems like this is a good time to revisit whether a parallel effort at CZ would be desirable.  If yes, then the two issues that I see need to be resolved:

  • CZ license:  this has been previously discussed, but to summarize briefly... I work at a for-profit company, and one part of what my small group does is host a free and public web site (http://symatlas.gnf.org) for gene annotation and expression data.  We'd like to incorporate any user-contributed content on WP/CZ back into our SymAtlas portal, but if any of the noncommercial licenses apply to these gene stubs, that'd pretty much be a deal-breaker here.
  • Bot policy: I still don't see any sort of bot policy on CZ.  We'd need to test whether our WP bot would work on CZ.  Also, there was previous talk of running bots from CZ servers; we'd need to figure out that arrangement.

Feedback welcome.  Cheers,
-andrew
Logged

Andrew Su
Forum Member
**
Posts: 11


« Reply #20 on: August 21, 2007, 19:33:05 UTC »

FYI, I've added a sample cluster at [[APP]] (http://en.citizendium.org/wiki/APP).  comments welcome...

-andrew
Logged

Stephen Ewen
Guest
« Reply #21 on: September 05, 2007, 02:00:08 UTC »

Quote
To counterbalance this, the first bot we allow is to import the articles from 1911 Britannica. The article texts are in the public domain and will always be. It should be easy to find the Public Domain version that was uploaded to wikipedia or via other websites. Legal precedent says that a copy of a work that is in the public domain is in the public domain. Like say a picture of a page in the 1911 Britannica. That also makes http://www.1911encyclopedia.org/ fair game, contrary to whatever the policy in their disclaimers and terms attempts to limit.

-Jason Potkanski

I'd urge serious caution here. 

In my understanding, while you can use the 1911 EB any way you want, that does not mean you may obtain it anywhere you want.  It is not as simple as matters only related to copyright or the lack thereof.

The "terms of use" at www.1911encyclopedia.org are essentially a contract between any user of the site and LoveToKnow Corp. Inc., the provider of the site, and they could file suit on that basis.

This part of the terms should especially cause serious pause: "You may not access our networks, computers, or Contents in any manner that could damage, disable, overburden, or impair them", which is clearly something the activity of a b*t could do.

By merely entering www.1911encyclopedia.org, users are bound by whatever parts of its terms of use (its contract) will stand up in court.

Quote
  • CZ license: this has been previously discussed, but to summarize briefly... I work at a for-profit company, and one part of what my small group does is host a free and public web site (http://symatlas.gnf.org) for gene annotation and expression data.  We'd like to incorporate any user-contributed content on WP/CZ back into our SymAtlas portal, but if any of the noncommercial licenses apply to these gene stubs, that'd pretty much be a deal-breaker here.

Feedback welcome.  Cheers,
-andrew

All Creative Commons non-commercial licenses allow the copyright holder to give permission to use the work commercially. I'd think that declining you that permission would be the actual deal-breaker, and such a decline would be simply unthinkable.
« Last Edit: September 08, 2007, 18:24:19 UTC by Stephen Ewen » Logged
Chris Day
Forum Regular
*****
Posts: 1068



« Reply #22 on: January 15, 2008, 18:44:38 UTC »


Just an update. This project is still going ahead on Wikipedia.  I saw the following blog post recently.

http://mndoci.com/blog/2008/01/13/the-genes-wiki-project/
Logged
