This lab assignment requires you to write a few programs. First, run update21. This will create the cs21/labs/03 directory and copy over any starting-point files for your programs. Next, move into your cs21/labs/03 directory and begin working on the Python programs for this lab. The pwd command helps you verify that you are in the correct sub-directory.
$ update21
$ cd cs21/labs/03
$ pwd
/home/your_user_name/cs21/labs/03
We will only grade files submitted by handin21 in this directory, so make sure your programs are in this directory!
As you write your programs, use good programming practices:
Write a program named piglatin.py that converts a word typed in by the user into Pig Latin. The rules of Pig Latin that we will use in this lab are as follows:
A few runs of your program might look like this:
$ python piglatin.py
This program converts a word you enter into Pig Latin.
Enter a word: swarthmore
Translated to Pig Latin: warthmoresay

$ python piglatin.py
This program converts a word you enter into Pig Latin.
Enter a word: igloo
Translated to Pig Latin: iglooyay
A common first step when performing a Google search is to convert a word into its root form to increase recall of relevant information. For example, Google may convert the words run, running, ran, runs into the root run. This process is also called stemming.
Write a program rootword.py that converts a word to its root form. This can be very complicated, so we'll just handle some common steps for finding root forms. Your conversion should accomplish the following:
$ python rootword.py
This program converts your search term into its root word
Enter word: denies
The root word is deny

$ python rootword.py
This program converts your search term into its root word
Enter word: distraction
The root word is distract

$ python rootword.py
This program converts your search term into its root word
Enter word: antihero
The root word is hero
If a word has multiple prefixes or suffixes, it's okay if you only take off one. We will not test on words that match more than one of the rules above (e.g., hemispheres). Speaking of which, we realize that this program is not perfect. There are many words it will fail to stem properly. For example:
But your program should work for words like these (in addition to the ones shown in the sample runs above):
A pair of characters in a string is a repeat if the two characters are consecutive and the same character. Write a program called countrepeats.py that asks the user for an input string and then counts how many repeats the string contains. Here are some examples:
$ python countrepeats.py
This program counts the number of repeat characters in a string
Enter string: I like puppies and kittens.
There are 2 repeats in that string.

$ python countrepeats.py
This program counts the number of repeat characters in a string
Enter string: yellow balloons oh my!!
There are 4 repeats in that string.

$ python countrepeats.py
This program counts the number of repeat characters in a string
Enter string: repeat characters are lame
There are 0 repeats in that string.
If the input string has a stretch of more than two consecutive, identical characters, each pair counts as a repeat. For example:
$ python countrepeats.py
This program counts the number of repeat characters in a string
Enter string: Wooohooo!!!
There are 6 repeats in that string.
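The counting rule above can be sketched as a single pass that compares each character with the one just before it. This is only a minimal sketch of the idea, not the required solution: the function name count_repeats is hypothetical, and it takes the string as a parameter instead of prompting the user.

```python
def count_repeats(text):
    """Count adjacent pairs of identical characters in text."""
    repeats = 0
    # Start at index 1 so each character can be compared with its left neighbor.
    for i in range(1, len(text)):
        if text[i] == text[i - 1]:
            repeats = repeats + 1
    return repeats


print(count_repeats("Wooohooo!!!"))
```

Because every adjacent pair is checked separately, a run of three identical characters contributes two repeats, which is how "Wooohooo!!!" reaches 6.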
The central dogma of biology states that the blueprint for all proteins is contained in an organism's DNA sequence. Specifically, DNA is transcribed into an intermediate molecule, RNA, which is then translated into a protein sequence.
Write a program called translate.py that converts a randomly-generated RNA sequence into a protein. Your program should utilize the following tips:
from genetics import *
>>> from genetics import *
>>> translateName("AUG")
'Met'
>>> translateName("GCC")
'Ala'
Here are three full examples:
$ python translate.py
The RNA Strand is UCCAUGUAAUGUAUACCUGCCUAAAAC
The final protein sequence is SerMetSTOPCysIleProAlaSTOPAsn

$ python translate.py
The RNA Strand is CACGCAGGCAUGCAGUAAAGUUCUACAUGAGCC
The final protein sequence is HisAlaGlyMetGlnSTOPSerSerThrSTOPAla

$ python translate.py
The RNA Strand is ACUAUGACAUGCCAACGCUAGCGUCCU
The final protein sequence is ThrMetThrCysGlnArgSTOPArgPro
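The examples above read the strand three bases at a time and join the translated names. The sketch below illustrates that slicing pattern under some assumptions: the course's genetics module is not reproduced here, so SAMPLE_CODONS is a hypothetical stand-in for translateName that covers only the codons in the first example strand, and translate_strand is a hypothetical helper name.

```python
# Hypothetical stand-in for the course's translateName(); it covers only
# the codons that appear in the first example strand.
SAMPLE_CODONS = {
    "UCC": "Ser", "AUG": "Met", "UAA": "STOP", "UGU": "Cys",
    "AUA": "Ile", "CCU": "Pro", "GCC": "Ala", "AAC": "Asn",
}


def translate_strand(rna, lookup):
    """Translate rna three bases at a time using the lookup function."""
    protein = ""
    # Step through the strand in jumps of 3; stopping before len(rna) - 2
    # skips any leftover bases that do not form a complete codon.
    for i in range(0, len(rna) - 2, 3):
        codon = rna[i:i + 3]
        protein = protein + lookup(codon)
    return protein


print(translate_strand("UCCAUGUAAUGUAUACCUGCCUAAAAC", SAMPLE_CODONS.get))
```

In the lab itself, translateName would take the place of SAMPLE_CODONS.get, and the strand would come from getRNAStrand() rather than a string literal.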
NOTE: when testing your code, you can manually set the RNA strand to the same strand every time by supplying a number from 1-5 to getRNAStrand(). For example:
Once you have finished testing your code, make sure the final version
of your program calls getRNAStrand()
(without a number) to generate a random strand.
The following extension is optional and does not affect your grade, so please
attempt it only after completing the rest of your lab.
For those with more biology background, you may recall that translation
does not start until we see the sequence AUG and must stop
when it hits a stop codon UAG, UAA, or UGA.
Add conditions to your solution to produce proteins that meet this definition.
$ python translate.py
The RNA Strand is CGCAUGUCGAGCCUAGUGCAGUAGUCU
The final protein sequence is MetSerSerLeuValGln

$ python translate.py
The RNA Strand is GCACGAAUGCUGUCAAAAUAACCACUCCGG
The final protein sequence is MetLeuSerLys

$ python translate.py
The RNA Strand is UCAAUGGGAUCAAAGCACUAGAUC
The final protein sequence is MetGlySerLysHis
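One way to sketch the start and stop conditions is to track whether AUG has been seen yet. The code below is an illustration under stated assumptions, not the expected solution: it assumes codons are read in the frame that begins at the first base of the strand (as in the sample runs above), SAMPLE_CODONS is a hypothetical stand-in for translateName covering only the first sample strand, and translate_gene is a hypothetical helper name.

```python
STOP_CODONS = ["UAG", "UAA", "UGA"]

# Hypothetical stand-in for the course's translateName(); it covers only
# the codons translated in the first sample strand.
SAMPLE_CODONS = {
    "AUG": "Met", "UCG": "Ser", "AGC": "Ser",
    "CUA": "Leu", "GUG": "Val", "CAG": "Gln",
}


def translate_gene(rna, lookup):
    """Translate from the first AUG up to, but not including, a stop codon."""
    protein = ""
    started = False
    for i in range(0, len(rna) - 2, 3):
        codon = rna[i:i + 3]
        if not started:
            # Ignore everything before the start codon.
            if codon == "AUG":
                started = True
                protein = protein + lookup(codon)
        elif codon in STOP_CODONS:
            # Translation ends here; the stop codon itself is not printed.
            break
        else:
            protein = protein + lookup(codon)
    return protein


print(translate_gene("CGCAUGUCGAGCCUAGUGCAGUAGUCU", SAMPLE_CODONS.get))
```

A stop codon that appears before any AUG is simply skipped, since translation has not started at that point.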
Remember you may run handin21 as many times as you like. Each time you run it new versions of your files will be submitted. Running handin21 after you finish a program, after any major changes are made, and at the end of the day (before you log out) is a good habit to get into.