Assignment

MYSTERY MOLECULE PROJECT, PART II (20 points total)

BIO1001 Fall 2020

To complete the remainder of this project and prepare for your presentation, follow the instructions below.  Your presentation should answer all of the questions listed in Step 2.  For your final presentation, organize your research and deliver a rehearsed, 10-12 minute talk, using the rubric at the end of this document as a guide for preparation.

Learning Objectives

At the conclusion of this phase of the mystery molecule project, students will be able to:

1. Effectively navigate sequence databases of the NCBI website

    a. BLAST a DNA sequence against the human genome

    b. Analyze alignment data from genomic databases

    c. Identify an unknown gene based on DNA sequence

2.  Use the scientific literature (primary and secondary resources) to analyze gene products and research the cellular and molecular basis of a disease gene.

3.  Organize scientific content into a coherent presentation.

4.  Communicate research results to a group of peers.

Step 1: BLAST your DNA sequence

1.  Go to:

http://www.ncbi.nlm.nih.gov/

2.  Click on BLAST (under “popular resources” at the right side of the screen)

3.  Choose the nucleotide blast program (under “Web BLAST”)

4.  In the “blastn” tab (on the left), enter your DNA sequence in the “enter query sequence” box.  Your mystery DNA sequences are in the Mystery Molecule content area in D2L.

    a.  Under “choose search set”, select “genomic + transcript databases”.

    b.  From the drop-down menu below it, select “Human genomic + transcript (Human G + T)”.

    c.  Under “program selection” in the next section down, choose to optimize for highly similar sequences (megablast).

5.  Submit the query (click “BLAST” at the bottom of the screen) and wait; this may take a few minutes.

6.  A colorized map of sequence alignment scores will appear.  Red indicates very good sequence alignment.  

7.  Scroll down to the descriptions of sequences that produced statistically significant alignments.  You should see options for genomic alignments and transcripts.  An e-value close to zero and a maximum sequence identity close to 100% are optimal.

8.  Identify your mystery gene/transcript from the description, e-value, and % sequence identity.
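The selection rule in items 7 and 8 (lowest e-value, then highest percent identity) can be sketched in Python.  This is only an illustration under stated assumptions: the Hit class, the function names, and every value in the hits list are hypothetical, and the real descriptions and scores come from your own BLAST results page.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One row of a BLAST results table (all values used below are hypothetical)."""
    description: str
    e_value: float
    percent_identity: float


def percent_identity(query: str, subject: str) -> float:
    """Percent of aligned positions where the two sequences carry the same base.

    Both strings must already be aligned (same length, gaps written as '-').
    """
    if not query or len(query) != len(subject):
        raise ValueError("aligned sequences must be non-empty and equal in length")
    pairs = zip(query.upper(), subject.upper())
    matches = sum(q == s and q != "-" for q, s in pairs)
    return 100.0 * matches / len(query)


def best_hit(hits: list[Hit]) -> Hit:
    """Pick the hit with the lowest e-value, breaking ties by highest identity."""
    return min(hits, key=lambda h: (h.e_value, -h.percent_identity))


# Hypothetical results, used only to show the ranking logic.
hits = [
    Hit("hypothetical gene A, mRNA", 0.0, 99.8),
    Hit("hypothetical gene B, mRNA", 3e-40, 87.5),
    Hit("hypothetical pseudogene C", 0.0, 94.1),
]

top = best_hit(hits)
identity = percent_identity("ATGGCCATTG", "ATGGCTATTG")
print(top.description, identity)
```

The ranking is a study aid, not a substitute for reading the descriptions: two hits with equally good scores can still be different transcripts of the same gene.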

Step 2: Analysis of gene product

Click on the reference number (at left, in the column labeled “Accession”) for the mRNA transcript or genomic sequence of your mystery gene.  Scroll down; you will see many references (titles, authors, years of publication) describing your gene.  Clicking on any one of the PubMed reference numbers will link you directly to the original publication.  You will need these publications to characterize your gene for your presentation.  Bookmark this page.  All of your references must be primary or secondary sources (no Google, Wikipedia, textbooks, blogs, etc.).  You must use a minimum of 7 references that can be found by searching NCBI/PubMed.  Note that obtaining them may require interlibrary loan, so plan ahead.

From these references, the rest of NCBI’s resources (PubMed, genes and disease area of NCBI, etc.), and the knowledge you gained from Part I of this project, you should be able to determine answers to the following questions:

About the Gene:

What is the name of your gene?

Where in the human genome is your gene located (i.e., on which chromosome)?
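A specific locus is usually written as chromosome, arm, and band (for example, 7q31.2).  The following Python sketch splits such a string into its parts.  The pattern and the parse_locus helper are illustrative assumptions, not part of any NCBI tool, and the sample strings only show the format.

```python
import re

# Cytogenetic location: chromosome (1-22, X, or Y), arm (p = short, q = long),
# then the band digits with an optional sub-band after a decimal point.
LOCUS_PATTERN = re.compile(
    r"^(?P<chromosome>1[0-9]|2[0-2]|[1-9]|X|Y)"
    r"(?P<arm>[pq])"
    r"(?P<band>\d+(?:\.\d+)?)$"
)


def parse_locus(locus: str) -> dict[str, str]:
    """Split a locus string such as '7q31.2' into chromosome, arm, and band."""
    match = LOCUS_PATTERN.match(locus.strip())
    if match is None:
        raise ValueError(f"not a recognizable cytogenetic locus: {locus!r}")
    parts = match.groupdict()
    parts["arm_name"] = "short arm" if parts["arm"] == "p" else "long arm"
    return parts


example = parse_locus("7q31.2")
print(example)
```

Naming only the chromosome, or only the chromosome and arm, leaves the band field unanswered, which is the distinction the rubric draws for gene location.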

About the Disease:

What disease is caused by mutation(s) in your gene?

What are the symptoms and signs of the disease?

How prevalent is the disease in the human population (how many people are affected; is it more likely to affect people based on gender/race/nationality; etc.)?

What is the expected outcome for people with this disease?  Is it deadly?  If so, what is the mortality rate?  If not, what treatment is available to people affected by the disease?

What specific mutation(s) destroy the function of the gene product?  (NOTE:  there may be more than one.)
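For the prevalence question above, published figures are often given either as cases per 100,000 people or as "1 in N".  The short Python sketch below converts a case count into both forms.  The function names and the counts are hypothetical and chosen only to show the arithmetic; use the figures reported in your references.

```python
def prevalence_per_100k(cases: int, population: int) -> float:
    """Number of affected people per 100,000 people in the population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return 100_000 * cases / population


def one_in_n(cases: int, population: int) -> int:
    """Express the same prevalence as '1 in N', rounded to a whole number."""
    if cases <= 0:
        raise ValueError("cases must be positive")
    return round(population / cases)


# Hypothetical counts, not data for any real disease.
cases, population = 30_000, 330_000_000
rate = prevalence_per_100k(cases, population)
ratio = one_in_n(cases, population)
print(f"{rate:.1f} per 100,000, or about 1 in {ratio:,}")
```

Running the same calculation separately for each demographic group makes differences by sex, ancestry, or region easy to compare on one slide.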

About the Gene Product (RNA and protein):

How long is the mRNA transcript (in bases or kilobases)?  Is there a difference between the wild-type and mutant forms?

How many exons are present in the transcript?  Is there a difference between the wild-type and mutant forms?

How many amino acids are present in the final, wild-type protein product?  

What change(s) in amino acids is(are) present in the mutant allele of this gene? How many amino acids are present in the mutant protein product?
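The amino acid questions above can be checked by translating the coding sequence yourself.  The Python sketch below translates a wild-type and a mutant coding sequence with the standard genetic code, counts the amino acids, and lists each change as original residue, position, and new residue.  The three short sequences and the helper names are hypothetical teaching examples, and the count includes the initial methionine.

```python
from itertools import product

# Standard genetic code, listed in TCAG order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a coding sequence (starting at ATG) up to the first stop codon."""
    cds = cds.upper()
    protein = []
    for i in range(0, len(cds) - 2, 3):
        amino_acid = CODON_TABLE[cds[i:i + 3]]
        if amino_acid == "*":  # stop codon ends the protein
            break
        protein.append(amino_acid)
    return "".join(protein)


def describe_changes(wild_type: str, mutant: str) -> list[str]:
    """List substitutions (e.g. 'E2V') and note any change in protein length."""
    changes = [
        f"{wt}{position}{mt}"
        for position, (wt, mt) in enumerate(zip(wild_type, mutant), start=1)
        if wt != mt
    ]
    if len(wild_type) != len(mutant):
        changes.append(f"length {len(wild_type)} -> {len(mutant)}")
    return changes


# Hypothetical toy sequences, not taken from any real gene.
wild_type_cds = "ATGGAAGTGCTGAAATAA"
missense_cds = "ATGGTAGTGCTGAAATAA"  # second codon GAA changed to GTA
nonsense_cds = "ATGGAAGTGTAGAAATAA"  # fourth codon CTG changed to stop codon TAG

wild_type_protein = translate(wild_type_cds)
missense_changes = describe_changes(wild_type_protein, translate(missense_cds))
nonsense_changes = describe_changes(wild_type_protein, translate(nonsense_cds))
print(len(wild_type_protein), missense_changes, nonsense_changes)
```

Reporting both the original and the mutant residue, as describe_changes does, matches the level of detail the rubric asks for under the mutant protein product.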

About molecular mechanisms:

How does the mutation affect the structure of your protein?  How do these structural changes influence the function of your protein (e.g., does the mutation affect an enzyme’s active site or allosteric site, or perhaps a transmembrane domain)?

What cellular and molecular processes are affected by mutation(s) in your gene?  (NOTE:  you might want to think of this in terms of major cellular systems, like transport, metabolism, structural proteins, etc.)

Step 3: Preparing an oral presentation

Use the information from your research to prepare an oral presentation.  In addition to answering all of the questions from Step 2 above, your presentation should:

• Last 10-12 minutes, including questions.

• Make use of visual aids, such as a PowerPoint presentation.

• Be logically organized, clear, and engaging for the audience.

Additional details about the mechanics of the presentation are found in the evaluation rubric below.

Rubric for oral presentation:  20 points total

Content:

Each criterion below is scored as Emerging (0.25 point), Developing (0.5 points), or Proficient (1 point).

Gene name
    Emerging: Not named, or only an abbreviation provided
    Developing: Provides full name, alternate designations incomplete or name misspelled or name mispronounced
    Proficient: Provides full name, including alternate designations, spelling and pronunciation correct

Gene location
    Emerging: Not given, or identifies only the chromosome
    Developing: Identifies the chromosome and arm (p or q)
    Proficient: Identifies the locus specifically

Disease associated with gene
    Emerging: Not named, or only provides abbreviation
    Developing: Provides full name, alternate designations incomplete or name misspelled or name mispronounced
    Proficient: Provides full name, including alternate designations, spelling and pronunciation correct

Symptoms and signs
    Emerging: More than two not listed, or more than two not described, or no distinction made between major/common and minor/rare ones
    Developing: One or two signs and symptoms not listed, or one or two not described, or incomplete distinction made between major/common and minor/rare ones
    Proficient: Signs and symptoms listed and described, major/common signs and symptoms distinguished from minor/rare ones

Prevalence
    Emerging: Unclear presentation of number of people affected, or incomplete information on more than two demographic considerations
    Developing: Clear presentation of number of people affected, but incomplete information on one or two demographic considerations
    Proficient: Clear presentation of number of people affected and demographic considerations (gender, race, nationality, geographic location, etc.)

Patient outcomes
    Emerging: Presentation of morbidity and/or mortality is underdeveloped or absent; discussion of standard of care absent
    Developing: Clear presentation of morbidity and/or mortality associated with the disease, but incomplete treatment of patient care presented
    Proficient: Clear presentation of morbidity and/or mortality associated with the disease, as well as standard of care for those with the disease

Specific mutations
    Emerging: Identification of mutation(s) omits specific DNA bases affected and exons/introns affected; or if multiple mutations are associated with the disease there is no distinction made between prevalent and rare mutations
    Developing: Identification of mutation(s) omits specific DNA bases affected or exons/introns affected; or if multiple mutations are associated with the disease there is little distinction made between prevalent and rare mutations
    Proficient: Mutation(s) and their locations are specifically identified (DNA bases affected, exons or introns affected); if multiple mutations are associated with disease, the focus is on the most prevalent mutation(s) or group(s) of mutation(s)

mRNA transcript
    Emerging: Length is not given, or is incorrect
    Developing: Length is given in bases or kilobases, but differences between wild-type and mutant transcripts are not identified
    Proficient: Length is presented in bases or kilobases, any differences between the wild-type and mutant transcripts are identified

Exons and introns
    Emerging: The number of exons and introns is either not given or incorrect
    Developing: The number of exons and introns is correctly presented, but their arrangement is unclear or differences between wild-type and mutant versions are not clearly presented
    Proficient: The number and arrangement of exons and introns is clearly presented, and any differences between the wild-type and mutant versions are clearly presented

Protein product (wild-type)
    Emerging: Information is not included in the presentation
    Developing: Number of amino acids in the wild-type protein is presented, but incorrect
    Proficient: Number of amino acids in the wild-type protein is presented and correct

Protein product (mutant)
    Emerging: Amino acid changes are not described, or are incorrect
    Developing: Amino acid changes are described with less clarity (e.g., by noting only the mutant amino acid), or the severity of mutations isn’t ranked (if applicable)
    Proficient: Amino acid changes are clearly described (e.g., by noting the original amino acid as well as the mutant), and severity of mutations is ranked (if applicable)

Structure/Function relationships
    Emerging: Does not present information on protein structure, or does not present information on protein function
    Developing: Presents only wild-type or mutant protein structure; or presents only wild-type protein function or the effect of the mutation
    Proficient: Wild-type and mutant protein structures are clearly contrasted; wild-type protein function is clearly described; effect of mutation (loss of function, change of function, gain of function) is clearly described

Disease mechanisms
    Emerging: Explanation of the cellular basis for disease contains more than two errors, or does not connect the mutant protein to signs and symptoms of the disease
    Developing: Explanation of the cellular basis for disease that contains up to two errors, or incompletely links the mutant protein to the signs and symptoms of disease
    Proficient: Thorough explanation of the cellular basis for disease that connects the aberrant function of the mutant protein to the signs and symptoms of the disease

Presentation structure and effectiveness:  30% of grade

Each feature below is rated Emerging, Developing, or Proficient.

Time Frame
    Emerging: Less than 9 minutes, or more than 13 minutes
    Developing: Between 9 and 10 minutes, or between 12 and 13 minutes
    Proficient: Between 10 and 12 minutes

Visual Aids
    Emerging: Absent, add little to the presentation, encourage “reading” of the presentation, or put too much information on each slide
    Developing: Often but not always enhance the presentation; clearly visible and easy to interpret
    Proficient: Enhance the presentation, with thoughts articulated, and keep the interest of the audience

Professionalism of Presentation
    Emerging: Thoughts not clearly articulated, poor posture and eye contact, does not engage audience; partners seem to be working independently of one another
    Developing: Presentation is organized, information is clearly presented, and some level of audience engagement is observed; partners work together but the presentation isn’t fully integrated
    Proficient: Presentation is very well organized, given with energy, and the interest of the audience is maintained; clear partnership observed

Organization and analysis
    Emerging: Content is sometimes presented in a logical pattern; transitions and discussion are rough or absent
    Developing: Most content is presented in a logical pattern, transitions are less polished, some topics may lack discussion
    Proficient: Content is presented in a logical pattern with clear transitions and discussion

References
    Emerging: Fewer than 5 peer-reviewed publications included as references
    Developing: 5 or 6 peer-reviewed publications included as references
    Proficient: At least 7 peer-reviewed publications included as references