CS46 — Homework 3


More Regular languages: Due Thursday 13 Feb at the start of class
Starting files are available from my public cs46 folder
cd ~/cs46/
cp -r examples/homework/wk3 homework/
cp -r examples/labs/wk3/ labs/
git status
git add homework/wk3 labs/wk3/
git commit -m "week3 start"
git push
You may work with a partner and submit a common solution to both the written and programming portion of this homework.
In lab practice
  1. Convert the following regexes to NFAs: $R_1=((a\cup b)^*(\varepsilon \cup c)^*)^*$ and $R_2=((ab)^* \cup (bc)^*) ab$.
  2. Convert the following NFAs to DFAs.
  3. Consider the regular expression $R=(ab \cup aab \cup aba)^*$. Design a simple NFA that accepts the language described by $R$ and convert this NFA to a DFA. Note: you probably don't want to use the union construction described in class as it will create an NFA with too many states.
Implement a NFA simulator
Implement a NFA simulator in the programming language of your choice.

Your code should accept two files as input. The first file is a description of the machine. The sample format for this file is shown below

qe 0
q0 0
q1 0
q2 0
q3 0
qf 1

qe E q0
q0 a q0
q0 b q0
q0 a q1
q0 b q2
q1 b qf
q2 b q3
q3 b qf
qf a qf
qf b qf
Each non empty line has either two or three strings separated by spaces. The lines with two strings represent states. The first string is the state name. The second string is either 0 or 1. A state is a final state if the second string is one. The first state listed is the start state (e.g., qe)

The lines with three strings represent the transition function. q0 a q0 is interpreted as "if the machine is in state q0 and reads an a, the machine may transition to state q0

You may assume that all states are listed before all transitions.

Use an E to indicate epsilon transitions, e.g., qe E q0

The second file is just a list of string that are inputs to the machine. See the example below.

ab
aabb
abb
baa
aaabbb
aabb
ababb
a
bb
aa
bbb
Your code should test each input string against the machine and label each string as accepted or not accepted.

Additionally, create a machine of your own and some test code to verify your program is correct.

$ python nfa_sol.py machine1.txt input1.txt 
ab: False
aabb: False
abb: True
baa: False
aaabbb: False
aabb: False
ababb: False
a: True
bb: False
aa: False
bbb: False

The solution below shows the intermediate state prior to reading the next symbol

$ python nfa_sol.py machine2.txt input1.txt 
['q0'] a
['q1', 'q0'] b
['q0', 'q2', 'qf'] no more input
ab: True
['q0'] a
['q1', 'q0'] a
['q1', 'q0'] b
['q0', 'q2', 'qf'] b
['q0', 'q3', 'q2', 'qf'] no more input
aabb: True
['q0'] a
['q1', 'q0'] b
['q0', 'q2', 'qf'] b
['q0', 'q3', 'q2', 'qf'] no more input
abb: True
['q0'] b
['q0', 'q2'] a
['q1', 'q0'] a
['q1', 'q0'] no more input
baa: False
['q0'] a
['q1', 'q0'] a
['q1', 'q0'] a
['q1', 'q0'] b
['q0', 'q2', 'qf'] b
['q0', 'q3', 'q2', 'qf'] b
['q0', 'q3', 'q2', 'qf'] no more input
aaabbb: True
['q0'] a
['q1', 'q0'] a
['q1', 'q0'] b
['q0', 'q2', 'qf'] b
['q0', 'q3', 'q2', 'qf'] no more input
aabb: True
['q0'] a
['q1', 'q0'] b
['q0', 'q2', 'qf'] a
['q1', 'q0', 'qf'] b
['q0', 'q2', 'qf'] b
['q0', 'q3', 'q2', 'qf'] no more input
ababb: True
['q0'] a
['q1', 'q0'] no more input
a: False
['q0'] b
['q0', 'q2'] b
['q0', 'q3', 'q2'] no more input
bb: False
['q0'] a
['q1', 'q0'] a
['q1', 'q0'] no more input
aa: False
['q0'] b
['q0', 'q2'] b
['q0', 'q3', 'q2'] b
['q0', 'q3', 'q2', 'qf'] no more input
bbb: True


$ python nfa_sol.py machine5.txt input1.txt 
['qe1', 'q0', 'qe2'] a
['q1'] b
['qf'] no more input
ab: True
['qe1', 'q0', 'qe2'] a
['q1'] a
[] no more states
aabb: False
['qe1', 'q0', 'qe2'] a
['q1'] b
['qf'] b
['qf'] no more input
abb: True
['qe1', 'q0', 'qe2'] b
[] no more states
baa: False
['qe1', 'q0', 'qe2'] a
['q1'] a
[] no more states
aaabbb: False
['qe1', 'q0', 'qe2'] a
['q1'] a
[] no more states
aabb: False
['qe1', 'q0', 'qe2'] a
['q1'] b
['qf'] a
['qf'] b
['qf'] b
['qf'] no more input
ababb: True
['qe1', 'q0', 'qe2'] a
['q1'] no more input
a: False
['qe1', 'q0', 'qe2'] b
[] no more states
bb: False
['qe1', 'q0', 'qe2'] a
['q1'] a
[] no more states
aa: False
['qe1', 'q0', 'qe2'] b
[] no more states
bbb: False

Think about what modifications you need to make to the DFA to make the simulation work for an NFA. Note in particular that each state could transition to zero, one, or more than one states when reading a single symbol. You will also need to handle epsilon closures, denoted $E(q)$ in the text and notes. Design your code in such a way to avoid infinite loops from trying to expand states that were previously expanded (machine5.txt has a potentially infinite epsilon loop)

Feel free to ask questions, but with the start code provided, you should have a basic idea of how to proceed

Remember to run git add, git commit, and git push to submit your assignments. If you are working with a partner, both you and your partner need to hand in the lab, but I will grade the most recent push. Include both names at the top of your program and your homework.