Contents:
Project Details
Useful Functions and Links to more Resources
Submission and Demo
Compiling a pthreads program: you need to include -pthread in the linking phase of compilation when you use the pthreads library. If you are using the makefile I gave you with the starting point code, add this to the LIBS variable:

    LIBS = $(LIBDIRS) -pthread

If you aren't using my makefile, include -pthread at the end of the gcc or g++ command line in your makefile.
I have a pthreads program with a makefile you can try out: /home/newhall/public/cs87/pthreads_example/
A server thread should only exit if the client closes its end of the
socket or if there are too many server threads that have been spawned
(you do not want to allow your server to use up all the socket resources
on a node in response to client connect requests). Your server should
not deny the new connection, but instead should "kill" server threads
from older client connections. There are many ways to detect when there
are too many threads and kill some. My suggestion for "killing" threads
is to have server threads call select on their socket with a
timeout value (5 or 10 seconds, for example). If the thread wakes up
because of a timeout, it should check to see if there are too many active
server threads. If not, it should just call select again and wait for
the next request or timeout; if there are too many, the thread should
clean up its state and exit.
This solution is not ideal since threads keep waking up when there is no real work to do (busy waiting), as opposed to only waking up when the client sends a request on the socket or when the thread should die. It also can result in times when there are more than the max number of server threads in the system, as the threads that should die have not yet woken up and died. However, it is fine to use this approach.
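The timeout loop described above might be sketched like this. The too_many_threads() stub and the request-handling details are placeholders standing in for your own bookkeeping; only the select logic is the point here:

```c
/* Sketch of a server thread that calls select with a timeout.
 * too_many_threads() is a stub -- replace it with your real check. */
#include <stdio.h>
#include <stdint.h>
#include <unistd.h>
#include <sys/select.h>

static int too_many_threads(void) { return 0; }  /* stub */

/* block until fd is readable or secs seconds pass;
 * returns 1 if readable, 0 on timeout, -1 on error */
int wait_readable(int fd, int secs) {
    fd_set rfds;
    struct timeval tv;
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    tv.tv_sec = secs;
    tv.tv_usec = 0;
    return select(fd + 1, &rfds, NULL, NULL, &tv);
}

void *server_thread_main(void *arg) {
    int sock = (int)(intptr_t)arg;
    char buf[1024];
    while (1) {
        int ret = wait_readable(sock, 5);   /* wake up at least every 5s */
        if (ret == 0) {                     /* timeout: should we die? */
            if (too_many_threads())
                break;
        } else if (ret == 1) {
            ssize_t n = read(sock, buf, sizeof(buf));
            if (n <= 0)                     /* client closed its end */
                break;
            /* ... parse and answer the request in buf ... */
        } else {
            break;                          /* select error */
        }
    }
    close(sock);
    return NULL;
}
```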
You are welcome to try using signals to notify a thread that it should die. Signals would remove the busy waiting and may result in fewer instances in which the number of active threads is above the max limit. One tricky part about using signals is that if a thread is in the middle of handling a client request when it receives the signal, it should complete that request before dying.
Assume that your web cache can store at most WEB_CACHE_SIZE number of web pages. Don't worry about the fact that the pages are all different sizes; your cache does not have to be in units of bytes, but in units of web pages. I suggest setting this to a small size (~10) to make testing and experimentation easier.
Your webserver should keep track of web caching statistics: the number of pages found in the cache (# hits) and the total number of webpages accessed.
To simulate a cache miss, a server thread should sleep for some number of ms before responding to the client. This simulates the extra time it takes to read the webpage from disk on a cache miss. A real disk read might take around 15ms; however, you may want to use a larger value to ensure that you can measure differences between two different sequences of url requests. Also, as you debug your web cache and replacement policy, you may want to make threads sleep for longer (a few seconds) so that you can "see" cache misses as they happen.
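A sketch of the simulated disk-read penalty; MISS_PENALTY_MS is an example value (not from the lab spec), so tune it for your experiments or raise it to a few seconds while debugging:

```c
#include <unistd.h>

#define MISS_PENALTY_MS 200   /* example value; a real disk read is ~15ms */

/* call in the server thread before responding on a cache miss */
void simulate_disk_read(void) {
    usleep(MISS_PENALTY_MS * 1000);   /* usleep takes microseconds */
}
```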
To implement LRU (evict the least recently requested page), you need to keep state about each cache entry's last request time. Instead of keeping real time values, a counter can be used as the "current time": increment it on every request and stamp the requested entry with its current value.
Cache state should only change when the webpage being requested is valid. If your server responds with an error code, there should be no changes to its cache.
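Putting the last few points together, here is one sketch of counter-based LRU bookkeeping with hit statistics. The struct, function, and variable names are my own for illustration, locking (which you will need with multiple server threads) is omitted for clarity, and this should only be called for valid pages, per the rule above:

```c
#include <string.h>

#define WEB_CACHE_SIZE 10     /* cache capacity in webpages, not bytes */

struct cache_entry {
    char url[256];
    int valid;
    unsigned long last_access;        /* counter value at last request */
};

static struct cache_entry cache[WEB_CACHE_SIZE];
static unsigned long clock_tick = 0;  /* the "current time" counter */
static int hits = 0, accesses = 0;    /* caching statistics */

/* record a request for url; returns 1 on a cache hit, 0 on a miss
 * (on a miss, a free or LRU entry is replaced by url) */
int cache_lookup(const char *url) {
    int i, victim = 0;
    accesses++;
    clock_tick++;
    for (i = 0; i < WEB_CACHE_SIZE; i++) {
        if (cache[i].valid && strcmp(cache[i].url, url) == 0) {
            cache[i].last_access = clock_tick;  /* refresh on a hit */
            hits++;
            return 1;
        }
    }
    /* miss: take a free slot, or evict the smallest last_access */
    for (i = 0; i < WEB_CACHE_SIZE; i++) {
        if (!cache[i].valid) { victim = i; break; }
        if (cache[i].last_access < cache[victim].last_access)
            victim = i;
    }
    strncpy(cache[victim].url, url, sizeof(cache[victim].url) - 1);
    cache[victim].url[sizeof(cache[victim].url) - 1] = '\0';
    cache[victim].valid = 1;
    cache[victim].last_access = clock_tick;
    return 0;
}
```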
wget may be the most useful way to experiment with your web cache. You can run wget to issue a set of url requests that are stored in a file. I suggest running experiments with just a single wget client to ensure a more controlled experiment.
To run wget to issue a set of url requests:
    # run this way with a small url_file for debugging
    # the webpages will be written to outfile and a log of wget
    # requests and responses will be written to logfile
    $ wget -v -i url_file -O outfile -o logfile

    # run this way for timed experiments with a bigger url_file:
    # the webpages and logfile get written to /dev/null so you won't
    # fill up your quota with large files
    $ time wget -v -i url_file -O /dev/null -o /dev/null

An example url_file is the following:
    http://www.cs.swarthmore.edu/
    http://www.cs.swarthmore.edu/~newhall/
    http://www.cs.swarthmore.edu/~richardw/
    http://www.cs.swarthmore.edu/~blah/

You will submit a short written summary of your experiments with your lab 3 solution.
    int pthread_create(pthread_t *restrict thread,
                       const pthread_attr_t *restrict attr,
                       void *(*start_routine)(void*),
                       void *restrict arg);

The third argument is a function pointer. You should pass it the name of the function that each thread will start executing. The prototype of the function you pass it must look something like this:
    void *my_thread_main(void *args);

This is a C-style generic function: it takes a pointer to anything (void*) and returns a pointer to anything (void*). You will need to pass your thread main function its socket. Function arguments are passed as the last parameter to pthread_create. Inside your thread main function, just re-cast args to an int to get the socket:
    int socket_fd;
    socket_fd = (int)(intptr_t)args;  /* cast through intptr_t to avoid
                                         a pointer-size warning on 64-bit */

If you want to pass more than one value to your thread main function, then you will need to define a new struct type, fill in the fields of the struct, one for each "parameter", and pass the struct as the void * arg. Again, just re-cast to the right type inside your function.
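As a sketch of the struct approach (the struct, field, and function names here are invented for illustration): heap-allocate the struct so it outlives the caller's stack frame, and let the thread free it.

```c
#include <stdlib.h>
#include <pthread.h>

struct thread_args {
    int socket_fd;    /* one field per "parameter" */
    int thread_id;
};

static void *my_thread_main(void *args) {
    struct thread_args *targs = (struct thread_args *)args; /* re-cast */
    int sock = targs->socket_fd;
    (void)sock;
    /* ... handle client requests on sock ... */
    free(targs);      /* the thread owns the heap-allocated copy */
    return NULL;
}

/* fill in the struct and hand it to the new thread */
int spawn_handler(int fd, int id, pthread_t *tid) {
    struct thread_args *targs = malloc(sizeof(*targs));
    if (targs == NULL)
        return -1;
    targs->socket_fd = fd;
    targs->thread_id = id;
    return pthread_create(tid, NULL, my_thread_main, targs);
}
```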
Here is what a call to pthread_create might look like if fd is the file descriptor returned by accept and tid is a variable of type pthread_t:
    ret = pthread_create(&tid, 0, my_thread_main, (void *)(intptr_t)fd);

There is a man page for pthreads, and for the individual functions in the pthreads library.
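One detail worth noting (an assumption on my part, not spelled out in the lab text): if your server never joins its per-client threads, detach them so an exited thread's resources are reclaimed automatically. A sketch:

```c
#include <stdint.h>
#include <pthread.h>

static void *my_thread_main(void *args) {
    int sock = (int)(intptr_t)args;   /* recover the socket fd */
    (void)sock;
    /* ... serve the client on sock ... */
    return NULL;
}

/* create a detached server thread for the accepted connection fd */
int spawn_detached(int fd) {
    pthread_t tid;
    int ret = pthread_create(&tid, NULL, my_thread_main,
                             (void *)(intptr_t)fd);
    if (ret != 0)
        return ret;
    pthread_detach(tid);   /* no pthread_join needed later */
    return 0;
}
```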
To watch the socket state on a node as your server runs (useful for spotting leaked connections):

    watch -n 1 cat /proc/net/tcp