Contents:
Project Details
Useful Functions and Links to more Resources
Submission and Demo
Compiling pthreads programs: you need to add -pthread to the linking phase of compilation when you use the pthreads library. If you are using the makefile I gave you with the starting point code, add this to the LIBS variable:
LIBS = $(LIBDIRS) -pthread
If you aren't using my makefile, include -pthread at the end of the gcc or g++ command line in your makefile.
I have a pthreads program with a makefile you can try out: /home/newhall/public/cs87/pthreads_example/
A server thread should only exit if the client closes its end of the
socket or if there are too many server threads that have been spawned
(you do not want to allow your server to use up all the socket resources
on a node in response to client connect requests). Your server should
not deny the new connection, but instead should "kill" server threads
from older client connections. There are many ways to detect when there
are too many threads and kill some. My suggestion for "killing" threads
is to have server threads call select on their socket with a
timeout value (5 or 10 seconds, for example). If the thread wakes up
because of a timeout, it should check to see if there are too many active
server threads. If not, it should just call select again.
This solution is not ideal since threads keep waking up when there is no
real work to do (busy waiting) as opposed to only waking up when the
client sends it a request on the socket or when it should die. It also
can result in times when there are more than the max number of server
threads in the system, as the threads that should die have not yet
woken up and died. However, it is fine to use this approach.
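As a rough sketch of the select-with-timeout idea described above, a server thread could wait on its socket with a small helper like the one below. The helper name wait_for_request and its return convention are hypothetical, not part of the starting point code; the check for too many threads is left to you.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Hypothetical helper: wait up to timeout_secs for fd to become readable.
 * Returns 1 if there is data to read, 0 on timeout, -1 on error. */
int wait_for_request(int fd, int timeout_secs) {
    fd_set readfds;
    struct timeval tv;

    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);

    /* select may modify tv, so set it before every call */
    tv.tv_sec = timeout_secs;
    tv.tv_usec = 0;

    int ret = select(fd + 1, &readfds, NULL, NULL, &tv);
    if (ret < 0) {
        return -1;          /* error: check errno */
    }
    return (ret > 0) ? 1 : 0;
}
```

A thread main loop might call wait_for_request(socket_fd, 5) repeatedly: on 1 it handles the request, on 0 it checks the active thread count and either exits or waits again.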
You are welcome to try using signals to notify a thread that it
should die. Signals would remove the busy waiting and may result in
fewer instances when the number of active threads is above the max limit.
One tricky part about using signals is that if a thread is in the middle
of handling a client request when it receives the signal, it should
complete that request before dying.
Assume that your web cache can store at most WEB_CACHE_SIZE number of web
pages. Don't worry about the fact that the pages are all different sizes;
your cache does not have to be in units of bytes, but in units of web pages.
I suggest setting this to a small size (~10) to make testing and
experimentation easier.
Your webserver should keep track of web caching statistics: the number
of pages found in the cache (# hits) and the total number of webpages
accessed.
To simulate a cache miss, a server thread should sleep for
some number of ms before responding to the client. This will
simulate the extra time it takes to read in the webpage from disk on
a cache miss. A real disk read might take 15ms; however, you may want
to use a larger number of ms to ensure that you can measure differences
between two different sequences of url requests. Also as you debug
your web cache and replacement policy, you may want to make threads sleep
for longer (a few seconds) so that you can "see" cache misses as they happen.
To implement LRU (least recently requested), you need to keep state about
each cache entry's last access time. Instead of keeping real time values,
a counter can be used as the "current time".
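Here is a minimal sketch of the counter idea, assuming the cache is a fixed array searched linearly. The struct layout, MAX_URL_LEN, and the function name cache_access are assumptions made for illustration, and your design may differ. It also keeps the hit and access counts. It is not thread safe on its own: callers would need to synchronize access to this shared state.

```c
#include <stdio.h>
#include <string.h>

#define WEB_CACHE_SIZE 10
#define MAX_URL_LEN 256            /* assumed max url length */

struct cache_entry {
    char url[MAX_URL_LEN];
    unsigned long last_access;     /* 0 means the slot is empty */
};

static struct cache_entry cache[WEB_CACHE_SIZE];
static unsigned long clock_ticks = 0;     /* counter used as "current time" */
static unsigned long hits = 0, accesses = 0;

/* Look up url in the cache. Returns 1 on a hit, 0 on a miss.
 * On a miss, url replaces an empty slot or the least recently
 * accessed entry. Call this only for valid webpages. */
int cache_access(const char *url) {
    int i, victim = 0;

    accesses++;
    clock_ticks++;
    for (i = 0; i < WEB_CACHE_SIZE; i++) {
        if (cache[i].last_access != 0 && strcmp(cache[i].url, url) == 0) {
            cache[i].last_access = clock_ticks;   /* refresh on hit */
            hits++;
            return 1;
        }
        /* remember the slot with the smallest (oldest) access time */
        if (cache[i].last_access < cache[victim].last_access) {
            victim = i;
        }
    }
    snprintf(cache[victim].url, MAX_URL_LEN, "%s", url);
    cache[victim].last_access = clock_ticks;
    return 0;
}
```

Because empty slots have an access time of 0, they are always chosen before any real entry is evicted.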
Cache state should only change when the webpage being requested is valid.
If your server responds with an error code, there should be no changes
to its cache.
wget may be the most useful way to experiment with your web cache.
You can run wget to issue a set of url requests that are stored in a file.
I suggest running experiments with just a single wget client to
ensure a more controlled experiment.
To run wget to issue a set of url requests:
Here is what a call to pthread_create might look like if fd is the
file descriptor returned by accept and tid is a variable of type
pthread_t:
Web caching
You will add to your web server a small web cache that uses an
LRU cache replacement policy.
Shared State
To implement both multiple server threads and web caching, you will have
some shared global state to which threads' access will need to be synchronized.
The pthreads library provides a mutex lock type, pthread_mutex_t, that you can use for this.
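As a small illustration of protecting shared state with a pthread_mutex_t, the sketch below guards a shared request counter. The names stats_lock, num_accesses, record_access, get_accesses, and request_worker are all hypothetical; the same lock/unlock pattern would apply to your cache and to your count of active threads.

```c
#include <pthread.h>

/* Shared global state and the mutex that protects it */
static pthread_mutex_t stats_lock = PTHREAD_MUTEX_INITIALIZER;
static long num_accesses = 0;

/* Increment the shared counter while holding the lock */
void record_access(void) {
    pthread_mutex_lock(&stats_lock);
    num_accesses++;
    pthread_mutex_unlock(&stats_lock);
}

/* Read the shared counter while holding the lock */
long get_accesses(void) {
    pthread_mutex_lock(&stats_lock);
    long n = num_accesses;
    pthread_mutex_unlock(&stats_lock);
    return n;
}

/* Demo thread main: pretends to handle 1000 requests */
void *request_worker(void *arg) {
    (void)arg;
    for (int i = 0; i < 1000; i++) {
        record_access();
    }
    return NULL;
}
```

Without the lock, concurrent increments from several threads could be lost, and the final count would be unpredictable.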
Experimentation
Once you have your new and improved webserver implemented, you should run
experiments on your server with different client request workloads
to test your replacement policy. You can time the total time of a
client request workload as one way of comparing different sequences
of requests that should result in different cache hit rates. Your
server should print out cache hit and access counts as well.
# run this way with a small url_file for debugging
# the webpages will be written to outfile and a log of wget
# requests and responses will be written to logfile
$ wget -v -i url_file -O outfile -o logfile
# run this way for timed experiments with a bigger url_file:
# the webpages and logfile get written to /dev/null so you won't
# fill up your quota with large files
$ time wget -v -i url_file -O /dev/null -o /dev/null
An example url_file is the following:
http://www.cs.swarthmore.edu/
http://www.cs.swarthmore.edu/~newhall/
http://www.cs.swarthmore.edu/~richardw/
http://www.cs.swarthmore.edu/~blah/
You will submit a short written summary of your experiments with your
lab 3 solution.
int pthread_create(pthread_t *restrict thread,
const pthread_attr_t *restrict attr,
void *(*start_routine)(void*),
void *restrict arg);
The third argument is a function pointer. You should pass it the name of
the function that each thread will start executing. The prototype of
the function you pass it must look something like this:
void *my_thread_main(void *args) ;
This is a C-style generic function: it takes a pointer to anything (void*)
and returns a pointer to anything (void*). You will need to pass your
thread main function its socket. Function arguments are passed as the
last parameter to pthread_create. Inside your thread main function
just re-cast args to an int to get the socket:
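int socket_fd;
/* cast through intptr_t (from <stdint.h>) so the pointer-to-int
   conversion is safe when pointers are wider than int */
socket_fd = (int)(intptr_t)(args);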
If you want to pass more than one value to your thread main function, then
you will need to define a new struct type, fill in the fields of the struct,
one for each "parameter" and pass the struct as the void * arg. Again,
just re-cast to the right type inside your function.
ret = pthread_create(&tid, NULL, my_thread_main, (void *)(intptr_t)fd);
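If you need to pass more than one value, the struct approach described above might look something like the following sketch. The struct thread_args type, its fields, and the spawn_server_thread helper are hypothetical names. The struct is allocated on the heap so that it stays valid after the spawning function returns, and the new thread frees it. The thread returns the fd it received only so that the example can be checked.

```c
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical argument struct: one field per "parameter" */
struct thread_args {
    int socket_fd;
    int thread_id;
};

void *my_thread_main(void *args) {
    /* re-cast the void * back to the real type */
    struct thread_args *ta = (struct thread_args *)args;
    int fd = ta->socket_fd;
    int id = ta->thread_id;

    free(ta);      /* the thread owns its argument struct */
    (void)id;      /* a real server would use id, e.g. for logging */

    /* ... handle client requests on fd here ... */

    return (void *)(intptr_t)fd;
}

/* Returns 0 on success, nonzero on failure */
int spawn_server_thread(pthread_t *tid, int fd, int id) {
    struct thread_args *ta = malloc(sizeof *ta);
    if (ta == NULL) {
        return -1;
    }
    ta->socket_fd = fd;
    ta->thread_id = id;

    int ret = pthread_create(tid, NULL, my_thread_main, ta);
    if (ret != 0) {
        free(ta);  /* thread never started, so clean up here */
    }
    return ret;
}
```

Passing the address of a local struct instead would be a bug if the spawning function returns, or reuses the struct for the next connection, before the new thread reads it.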
There is a man page for pthreads, and for the individual functions in
the pthreads library.
watch -n 1 cat /proc/net/tcp
Submit it by running cs87handin.
Demo
You and your partner will sign up for a 15 minute demo slot to demo your
web server. Think about, and practice, different scenarios to demonstrate
both correctness and good error handling. You will want to demonstrate
concurrent client connections, persistent connections, and that you have
correctly implemented LRU.