Documentation: Science

From CompBio
Revision as of 03:26, 14 October 2013 by Ram (Talk | contribs)



Group meetings

Group meetings are generally held Thursdays from 2-4pm unless I am travelling or can't make it for other reasons. You're always welcome to have a group meeting even when I am not around, and I encourage it. Generally the idea is that each person will present, either formally or informally, what they have done since their last presentation.

A group social/dinner/happy hour follows the group meeting, which in the past has lasted until 7am the next morning.

Weekly reports

In addition to attending the group meetings and making a formal or informal presentation, every member of the group is expected to submit a weekly report detailing their progress to the archived group mailing list (see below for details). These messages are archived in Google Groups and you can log in there to read past messages and search through them. It is expected that they will be archived forever, so use your judgement on what you write.

Weekly reports are important for your career development as an independent scientist, to maintain and motivate collaborations within the group, and to obtain feedback on your progress. Weekly reports should reflect research done within the group, but can also elaborate on whatever else is going on in your life to foster other types of connections. For some of us this has included everything from social life to coursework and work done in other research rotations (from being a PI to being a graduate student to marriages and births of offspring). Reports are due by group meeting every week (2pm Thursdays), but if you are late, please still send one along. Better late than never.

Group archives

Mailing list archive

In addition to archiving weekly reports, the archived mailing list can also be used for any scientific discussions that you feel should be archived for posterity's sake. You can access previous posts in the Google Groups archive. Reading them is a good way to learn how things work, and people who've done this have reported positive results.

A good way to present your results is to connect them to your weekly presentations and data generated using the file archive below. However, since there's an inertial barrier for me (or anyone else) to go over these files, it would help to summarise any major conclusions/outcomes/etc. obtained (rather than just stating what you did).

File (group) archive

ALL the presentations and data from our group (talks at group meetings, PDFs sent to me, any results, posters, talks at conferences, abstract submissions, paper submissions, grants, RMSD vs. score plots, WHATEVER) and anything you wish me to see should be put in our group archive and pointers should be given to this data. I deem this necessary since whenever I write a grant or need to look up something, the downtime is too long for me to wait for a response. Plus it's good to have a historical record. Consider this the equivalent of a bench person's laboratory notebook, which I think is an excellent practice that computational scientists should adopt.

The presentations are STRICTLY organised by subdirectories containing the date of the presentation with the form: mmmddyyyy. For example, mar012006. Within that subdirectory, you can place anything you want for that date.

So normally, you'd issue the following commands (feel free to write a little script for this if you'd like):

mkdir ~archive/<username>/presentations/mmmddyyyy/
cp <files> ~archive/<username>/presentations/mmmddyyyy/

For example:

mkdir ~archive/mcdermottj/presentations/feb242006/
cp jobtalk.ppt ~archive/mcdermottj/presentations/feb242006/
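
The two commands above can be wrapped in a little script. What follows is only a sketch under stated assumptions: the function name archive_presentation and its first argument (the archive root, which would be ~archive on our network) are made up for illustration, and the directory is stamped with today's date in the mmmddyyyy form.

```shell
#!/bin/sh
# Sketch: copy files into <root>/<username>/presentations/<mmmddyyyy>/.
# archive_presentation is a hypothetical helper, not an existing group tool.
archive_presentation() {
    root=$1
    user=$2
    shift 2
    # "Mar012006" is lowercased to "mar012006"; LC_ALL=C keeps English month names.
    stamp=$(LC_ALL=C date +%b%d%Y | tr '[:upper:]' '[:lower:]')
    dest="$root/$user/presentations/$stamp"
    mkdir -p "$dest" || return 1
    for f in "$@"; do
        # Refuse to overwrite anything already archived under this date.
        if [ -e "$dest/$(basename "$f")" ]; then
            echo "skipping $f: already present in $dest" >&2
            continue
        fi
        cp "$f" "$dest/" || return 1
    done
    # Print the destination so the caller can point others to it.
    echo "$dest"
}
```

With this loaded, running archive_presentation ~archive mcdermottj jobtalk.ppt would copy jobtalk.ppt into a directory named for today's date and print that directory's path.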

Please make this a regular habit and don't make me remind you. Please also use the date convention EXACTLY and DO NOT create directories or place files outside of this directory structure! I will be checking this. Finally, when you need to update a presentation, use a NEW date stamp and don't overwrite the old one.
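
If you want to check the convention yourself before I do, something along these lines may help. It is a sketch only: the function name is an assumption, and the pattern checks just the shape of the name (three-letter lowercase month, two-digit day, four-digit year), not whether the date really exists.

```shell
#!/bin/sh
# Sketch: list entries under a presentations directory whose names do not
# follow the mmmddyyyy convention (for example mar012006).
# check_presentation_dirs is a hypothetical helper name.
check_presentation_dirs() {
    dir=$1
    pattern='^(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)[0-3][0-9][0-9]{4}$'
    bad=0
    for entry in "$dir"/*; do
        [ -e "$entry" ] || continue   # empty directory: nothing to check
        name=$(basename "$entry")
        # A conforming entry is a directory whose name matches the pattern.
        if [ -d "$entry" ] && printf '%s\n' "$name" | grep -Eq "$pattern"; then
            continue
        fi
        echo "does not follow convention: $entry"
        bad=1
    done
    return $bad
}
```

The function prints every offending entry and returns a non-zero status if it finds any, so it can be used in a larger script.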

Attachments and formats

Please avoid sending me attachments in email, especially those involving MS software! You can include the proper ASCII translation in the email (see below), put stuff up in the presentations directory, or point me to files in your home directories (ideal). I suggest that you become comfortable with accessing and depositing files on your Linux account in our network. Also, I prefer ASCII over any other format (most of which I convert to ASCII anyway). PDF documents can be converted to ASCII using pdftotext, and MS Word documents can be converted to text using antiword (RTFM or install it if you've not already). For most purposes, this is adequate. This is the FASTEST way I'll get to see anything you send me and receive a response (and I'm normally slow)!
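
As a rough illustration of the conversions mentioned above, the sketch below picks the tool by file extension. The function name to_ascii is an assumption, both tools must already be installed, and antiword handles the older .doc format only.

```shell
#!/bin/sh
# Sketch: write an ASCII version of a PDF or MS Word file next to the original.
# to_ascii is a hypothetical helper; pdftotext and antiword must be installed.
to_ascii() {
    src=$1
    case "$src" in
        # pdftotext takes the input PDF and the output text file as arguments.
        *.pdf) pdftotext "$src" "${src%.pdf}.txt" ;;
        # antiword writes the converted text to standard output.
        *.doc) antiword "$src" > "${src%.doc}.txt" ;;
        *)     echo "to_ascii: don't know how to convert $src" >&2
               return 1 ;;
    esac
}
```

For instance, to_ascii paper.pdf would leave paper.txt beside the original, ready to paste into an email.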

Please note that as of this writing, UW students and staff get a 1GB quota of disk space for files of all services offered by UW Technology, including web publishing. Sending a link to large files is a "friendly" way to share files with everyone, and avoids encountering odd limits mailing lists may have on attachment sizes.

Intellectual property policy

Generally, for any work of authorship involving copyrights, everything authored by group members is technically in the public domain, but this is achieved in a unique way. If you write a piece of software, you may put your name as an author, but there can't be any copyright statements of any sort since such statements are not necessary. Copyrights generally vest upon creation of a work of authorship under copyright law. So there is no need to put any copyright or licensing statements if you wish to do nothing other than acquire the legal copyrights available. If a work you're creating is a derivative of another work that is licensed under a traditional copyleft or "viral" license (such as the GNU General Public License (GPL)), then you're generally obligated to release your work under such a license also. Otherwise if you create an original work of authorship that is not a work-for-hire (defined under agency law) and do nothing else, then all relevant copyrights vest with you (note the stealth approach possible here). If you make a statement to the effect that a given work is released/placed/put into the public domain, it means that the work you created doesn't belong to anyone (or belongs to the public) and they can do whatever they wish with it (even claim it is theirs). The advantage of either approach is that if you leave, you can "take" the code with you, and any modifications you make will be governed by whatever future obligations you may have (if any). An example draft agreement is below (if you don't sign it, I assume you have agreed to it).

If we decide to commercialise anything we create, or really want to proprietarise anything (and that is a mutual choice), then it will be through patents (methodological inventions either constructively or actually reduced to practice). IF we do decide to do this, inventorship will be based on the respective contributions (typically equally shared, which is the UW default policy) and the UW will own the rights to the patent. ALL inventors (regardless of academic hierarchy, i.e., preundergraduate, undergraduate, graduate, postgraduate, faculty, etc.) will be treated fairly no matter their level of contribution and decisions will not be made unilaterally. If you feel I err in making these decisions (as PI), then please be proactive and let me know. Please note that as PI, I am the interface between the group and the UW Office of Technology Transfer (TechTransfer) and I normally argue on behalf of all inventors (sometimes when they are present, and sometimes not). TechTransfer staff are paid to do our IP-related work, and they generally hire people with years of experience to do this. So unless you really want a learning experience, the general view is to provide them with basic background and let them sort out the claims and language of the patent (which can then be reviewed by us).

Philosophical notes

I am against the concept of intellectual property (copyrights and patents, primarily; I don't view trade marks, trade secrets, and other forms in the same manner). See my primer on the ethics of intellectual property and the influence I have had (since 1994) in the free (unrestricted, not zero price) copying of digital media.

However, software and music are creations that I believe can be produced for little cost, with that cost recouped through the viral distribution that is enabled by free copying. Since our therapeutic discovery work has taken off, I have found it hard, if not impossible, to independently support the development of therapeutics to a stage (i.e., through clinical trials, etc.) that will reach the average person, especially since my interest is in third world diseases. To this end, for now, I will play the "game" to secure money using the limited monopolies that are patent laws for first world diseases which can then be channelled to treat third world diseases.

In the long term (10-100 years), if our technological and business models are successful, I believe that therapeutic discovery, reproduction, and distribution can be achieved in the same way software or music reproduction can be. By this I don't mean we can transmit drugs over the Internet, but that we can create "copiers" where a drug design is input and a physical drug is output instantly on demand--personalised and preventative medicine--based on the genotype of the host (others are working on solving this problem of producing genotypes on demand and for cheap, a problem which I believe will be solved in the next 20-30 years). The diagnosis and dispensing of therapeutics will still be done by practitioners of medicine (i.e., doctors, pharmacists, etc.) but the costs will be greatly reduced and there will be no monopoly-based restrictions on therapeutics.

That is, I see current pharmaceutical companies going the same way as the current music and software industries. At the time of writing, Microsoft's monopolies haven't been broken (and I think they exist NOT because of their IP policies but because of their business acumen), but we do have several thousand companies distributing Linux/Unix/FreeBSD and even Apple's Mac OS X uses FreeBSD as its base operating system.


This section is near the end not because of its lack of importance, but because I can keep adding to it. I expect complete and total integrity, and the highest standard. Do not lie to yourself, first and foremost, and distrust everything you do (but be optimistic). We are trying to solve problems, and we must design protocols and systems so that we are not introducing information about the correct answer, or the answer we want; this should be carefully guarded against. Some important reading material includes:

  • Pathological science by Irving Langmuir
  • Millikan's oil drop experiment

Web server maintenance

If you have a web server that makes predictions and sends models, please check on a routine basis (every week/month) to ensure:

  • Your predictions are actually getting e-mailed to people. This should be done by submitting a prediction and checking to make sure the results are e-mailed to an address outside of the domain (your address is fine). You can set up a script to do this.
  • Your script works correctly and produces the right results. I do this currently by having a test set of proteins for which I know the answer and checking to make sure what is returned by the server is consistent with that answer.

This is really important since we might be under review for a grant/paper at any time and the last thing we want is a reviewer to be upset if our server isn't functioning properly.
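
The known-answer check described above could be scripted roughly as follows. This is a sketch, not what I actually run: the function name and the directory layout are assumptions, how the results are fetched from your server is left out because it is server-specific, and a byte-for-byte comparison is a stricter stand-in for "consistent with that answer".

```shell
#!/bin/sh
# Sketch: compare results returned by a server against stored known answers.
# Assumed layout (hypothetical): <expected_dir>/<name>.out holds the known
# answer and <returned_dir>/<name>.out holds what the server sent back.
check_server_results() {
    expected_dir=$1
    returned_dir=$2
    status=0
    for expected in "$expected_dir"/*.out; do
        [ -e "$expected" ] || continue   # no test set present
        name=$(basename "$expected")
        if [ ! -f "$returned_dir/$name" ]; then
            echo "MISSING:  $name (no result came back)"
            status=1
        elif cmp -s "$expected" "$returned_dir/$name"; then
            echo "OK:       $name"
        else
            echo "MISMATCH: $name"
            status=1
        fi
    done
    # Non-zero status means at least one test protein failed the check.
    return $status
}
```

Because the function returns a non-zero status on any missing or mismatched result, it can be run on a schedule and made to send a warning only when something is wrong.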

Public addresses for web servers

The current major science programs for the group are CANDO, Protinfo, and Bioverse. The default URLs for these are:

Modules take on the form:


The default email addresses for CANDO, Protinfo, and Bioverse webservers are cando@compbio, bioverse@compbio and protinfo@compbio (which also correspond to admin@cando.compbio, admin@protinfo, and admin@bioverse, but I discourage use of these addresses). If a more specific address is desired, then the address <module>@<server> needs to be created by the sysadm.
