Proxy Lab

mindbergh/ProxyLab: CMU 15-213 Proxy Lab – GitHub

# CS:APP Proxy Lab
# Student Source Files
This directory contains the files you will need for the CS:APP Proxy
Lab.
proxy.c
csapp.h
csapp.c
open_clientfd_r.c
These are starter files. csapp.c and csapp.h are described in
your textbook. open_clientfd_r.c is a thread-safe version of
open_clientfd() based on the getaddrinfo() library function.
You may make any changes you like to these files. And you may
create and handin any additional files you like.
Please use the port scripts described below to generate
unused ports for your proxy or tiny server.
This is the makefile that builds the proxy program. Type “make”
to build your solution, or “make clean” followed by “make” for a
fresh build.
Type “make handin” to create the tarfile that you will be handing
in. You can modify it any way you like. Autolab will use your
Makefile to build your proxy from source.
Generates a random port for a particular user
usage:. /
Handy script that identifies an unused TCP port that you can use
for your proxy or tiny.
usage:. /
The autograder for Basic, Caching, and CachingConcurrency.
helper for the autograder.
Tiny Web server from the CS:APP text
CS:APP 3/e Proxy Lab - HackMD

# CS:APP 3/e Proxy Lab
contributed by < `type59ty` >
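###### tags: `sysprog2018`
> [Source code]()
## Preparation
- Download the [handout]()
- Study the [write up]()
- Study CSAPP ch 10, 11, 12
## References
- [Proxy server]()
- [LAN Controller: Proxy Server]()
- [CSAPP: Proxy lab]()
## Introduction to Proxies
A web proxy is an intermediary program between a web browser and an end server. When browsing, the user does not connect directly to the end server. Instead, the proxy receives the request and sends it to the end server, then passes the end server's response (e.g., the page content) back to the user's browser. The proxy therefore plays both the client and the server role.
Some network devices such as gateways and routers provide proxy functionality. Proxy services are generally considered helpful for protecting the privacy or security of network endpoints and for preventing attacks.
## Assignment Requirements
Write a simple HTTP proxy that caches web content. The lab has three parts:
1. Set up the basic proxy: accept incoming connections, read and parse requests, forward the requests to web servers, read the servers' responses, and finally forward the responses to the corresponding clients. This part teaches basic HTTP operation and how to use sockets to write programs that communicate over a network.
2. Extend the proxy so that it can handle multiple connections concurrently.
3. Add caching: a simple main-memory cache that stores recently requested web content.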
## Practice: echo server and client
Following pp. 663-664 of the textbook, build a simple client and server to get familiar with these functions.
The call to Open_clientfd establishes the connection to the server.
- echoclient.c

      #include "csapp.h"

      int main(int argc, char **argv) {
          int clientfd;
          char *host, *port, buf[MAXLINE];
          rio_t rio;

          if (argc != 3) {
              fprintf(stderr, "usage: %s <host> <port>\n", argv[0]);
              exit(0);
          }
          host = argv[1];
          port = argv[2];

          clientfd = Open_clientfd(host, port);
          Rio_readinitb(&rio, clientfd);

          /* Send each stdin line to the server and print the echoed reply */
          while (Fgets(buf, MAXLINE, stdin) != NULL) {
              Rio_writen(clientfd, buf, strlen(buf));
              Rio_readlineb(&rio, buf, MAXLINE);
              Fputs(buf, stdout);
          }
          Close(clientfd);
          exit(0);
      }

- echoserver.c

      #include "csapp.h"

      void echo(int connfd);

      int main(int argc, char **argv) {
          int listenfd, connfd;
          socklen_t clientlen;
          struct sockaddr_storage clientaddr; /* large enough for any address */
          char client_hostname[MAXLINE], client_port[MAXLINE];

          if (argc != 2) {
              fprintf(stderr, "usage: %s <port>\n", argv[0]);
              exit(0);
          }

          listenfd = Open_listenfd(argv[1]);
          while (1) {
              clientlen = sizeof(struct sockaddr_storage);
              connfd = Accept(listenfd, (SA *)&clientaddr, &clientlen);
              Getnameinfo((SA *)&clientaddr, clientlen, client_hostname, MAXLINE,
                          client_port, MAXLINE, 0);
              printf("Connected to (%s, %s)\n", client_hostname, client_port);
              echo(connfd);
              Close(connfd);
          }
          exit(0);
      }

      /* Echo each line received on connfd back to the client */
      void echo(int connfd) {
          size_t n;
          char buf[MAXLINE];
          rio_t rio;

          Rio_readinitb(&rio, connfd);
          while ((n = Rio_readlineb(&rio, buf, MAXLINE)) != 0) {
              printf("server received %d bytes\n", (int)n);
              Rio_writen(connfd, buf, n);
          }
      }

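#### Usage:
Start echoserver first and give it a port; this is the port the client will connect to.

    $ ./echoserver 4000

Then connect from the client side:

    $ ./echoclient hostname 4000

The server will print:

    Connected to (localhost, 49492)

This means the connection succeeded. The client can then send requests; in this example the server returns the string the client sent.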
## Part I: Implementing a sequential web proxy
- Goal: implement a sequential proxy that handles HTTP/1.0 GET requests.

Following CSAPP p. 667, first define the two message formats:
1. HTTP request:
A **request line** (line 5) followed by zero or more **request headers** (line 6), followed by an empty text line that terminates the header list (line 7).
- Request line format:

      method URI version

- Request header format:

      header-name: header-data

2. HTTP response:
Similar to an HTTP request: a **response line** (line 8) followed by zero or more **response headers** (lines 9-13), an empty line that terminates the header list (line 14), and then the response body (lines 15-17).
:::info
Most of the structure can follow the TINY Web server on p. 671.
:::
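
As a small illustration of the request-line format described above, the sketch below splits a request line into its three fields with `sscanf`. The function name `split_request_line` and the constant `FIELD_MAX` are hypothetical and do not come from the handout; a real proxy would typically use `MAXLINE` buffers from csapp.h.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define FIELD_MAX 256  /* hypothetical buffer size for this sketch */

/* Split "method URI version" into three fields.
 * Returns 1 on success, 0 if the line does not contain three tokens. */
int split_request_line(const char *line,
                       char method[FIELD_MAX],
                       char uri[FIELD_MAX],
                       char version[FIELD_MAX])
{
    assert(line != NULL);
    /* %255s bounds each field to FIELD_MAX - 1 characters plus the NUL.
     * The trailing "\r\n" counts as whitespace, so it is not copied. */
    int got = sscanf(line, "%255s %255s %255s", method, uri, version);
    return got == 3;
}
```

Checking the return value lets the caller reject malformed request lines instead of reading uninitialized buffers.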
### Makefile

    CC = gcc
    CFLAGS = -g -Wall
    LDFLAGS = -lpthread

    all: proxy

    csapp.o: csapp.c csapp.h
    	$(CC) $(CFLAGS) -c csapp.c

    proxy.o: proxy.c csapp.h
    	$(CC) $(CFLAGS) -c proxy.c

    proxy: proxy.o csapp.o
    	$(CC) $(CFLAGS) proxy.o csapp.o -o proxy $(LDFLAGS)

    # handin rule, kept as found:
    #   --exclude --exclude --exclude ".*")

    clean:
    	rm -f *~ *.o proxy core

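### main

    int main(int argc, char **argv) {
        int listenfd, connfd;
        socklen_t clientlen;
        char hostname[MAXLINE], port[MAXLINE];
        struct sockaddr_storage clientaddr;
        if (argc != 2) {
            fprintf(stderr, "usage: %s <port>\n", argv[0]);
            exit(1);
        }
        listenfd = Open_listenfd(argv[1]);
        while (1) {
            clientlen = sizeof(clientaddr);
            connfd = Accept(listenfd, (SA *)&clientaddr, &clientlen);
            /* print accepted message */
            Getnameinfo((SA *)&clientaddr, clientlen, hostname, MAXLINE, port, MAXLINE, 0);
            printf("Accepted connection from (%s %s).\n", hostname, port);
            /* handle the client transaction sequentially */
            doit(connfd);
            Close(connfd);
        }
        return 0;
    }
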
### doit

    /* handle the client HTTP transaction */
    void doit(int connfd)
    {
        int end_serverfd; /* the end server file descriptor */
        char buf[MAXLINE], method[MAXLINE], uri[MAXLINE], version[MAXLINE];
        char endserver__header[MAXLINE];
        /* store the request line arguments */
        char hostname[MAXLINE], path[MAXLINE];
        int port;
        size_t n;
        rio_t rio, server_rio; /* rio is the client's rio, server_rio is the end server's rio */

        Rio_readinitb(&rio, connfd);
        if (Rio_readlineb(&rio, buf, MAXLINE) <= 0)
            return;
        if (sscanf(buf, "%s %s %s", method, uri, version) != 3) /* read the client request line */
            return;
        if (strcasecmp(method, "GET")) {
            printf("Proxy does not implement the method\n");
            return;
        }
        /* parse the uri to get hostname, file path, port */
        parse_uri(uri, hostname, path, &port);
        /* build the header which will be sent to the end server */
        build__header(endserver__header, hostname, path, port, &rio);
        /* connect to the end server */
        end_serverfd = connect_endServer(hostname, port, endserver__header);
        if (end_serverfd < 0) {
            printf("connection failed\n");
            return;
        }
        Rio_readinitb(&server_rio, end_serverfd);
        /* write the header to the end server */
        Rio_writen(end_serverfd, endserver__header, strlen(endserver__header));
        /* receive the response from the end server and send it to the client */
        while ((n = Rio_readlineb(&server_rio, buf, MAXLINE)) != 0) {
            printf("proxy received %d bytes, then send\n", (int)n);
            Rio_writen(connfd, buf, n);
        }
        Close(end_serverfd);
    }

### build__header

    /* The header keys and format strings used below (requestlint_hdr_format,
       endof_hdr, host_key, conn_hdr, ...) are assumed to be defined at file scope. */
    void build__header(char *_header, char *hostname, char *path, int port, rio_t *client_rio)
    {
        char buf[MAXLINE], request_hdr[MAXLINE], other_hdr[MAXLINE] = "", host_hdr[MAXLINE] = "";
        /* request line */
        sprintf(request_hdr, requestlint_hdr_format, path);
        /* read the remaining client headers and filter them */
        while (Rio_readlineb(client_rio, buf, MAXLINE) > 0) {
            if (strcmp(buf, endof_hdr) == 0) break; /* end of headers */
            if (!strncasecmp(buf, host_key, strlen(host_key))) { /* Host: */
                strcpy(host_hdr, buf);
                continue;
            }
            /* keep every header except the ones the proxy overrides */
            if (strncasecmp(buf, connection_key, strlen(connection_key))
                && strncasecmp(buf, proxy_connection_key, strlen(proxy_connection_key))
                && strncasecmp(buf, user_agent_key, strlen(user_agent_key))) {
                strcat(other_hdr, buf);
            }
        }
        if (strlen(host_hdr) == 0) {
            sprintf(host_hdr, host_hdr_format, hostname);
        }
        /* the argument list is truncated in the source; this order is an assumption */
        sprintf(_header, "%s%s%s%s%s%s%s",
                request_hdr, host_hdr, conn_hdr, prox_hdr,
                user_agent_hdr, other_hdr, endof_hdr);
    }

### connect_endServer

    /* Connect to the end server */
    static inline int connect_endServer(char *hostname, int port, char *_header) {
        char portStr[100];
        sprintf(portStr, "%d", port);
        return Open_clientfd(hostname, portStr);
    }

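### parse_uri

    void parse_uri(char *uri, char *hostname, char *path, int *port)
    {
        *port = 80;
        strcpy(path, "/"); /* default path when the URI has none */
        char *pos = strstr(uri, "//");
        pos = pos != NULL ? pos + 2 : uri;
        char *pos2 = strstr(pos, ":");
        if (pos2 != NULL) {
            /* host:port/path */
            *pos2 = '\0';
            sscanf(pos, "%s", hostname);
            sscanf(pos2 + 1, "%d%s", port, path);
        } else {
            pos2 = strstr(pos, "/");
            if (pos2 != NULL) {
                /* host/path */
                *pos2 = '\0';
                sscanf(pos, "%s", hostname);
                *pos2 = '/';
                sscanf(pos2, "%s", path);
            } else {
                /* host only */
                sscanf(pos, "%s", hostname);
            }
        }
    }
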
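
The `parse_uri` above temporarily writes `'\0'` into the URI it is given, which matters if the same string is later used as a cache key. As a hedged alternative, here is a minimal sketch of a parser that leaves its input untouched. The name `parse_uri_const` and the `cap` parameter are hypothetical and are not part of the original code.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: parse "[http://]host[:port][/path]" without
 * modifying uri. hostname and path must each have room for cap bytes.
 * Returns 0 on success, -1 on a malformed or oversized URI. */
int parse_uri_const(const char *uri, char *hostname, char *path,
                    int *port, size_t cap)
{
    assert(uri && hostname && path && port && cap > 0);
    const char *host = strstr(uri, "//");
    host = host ? host + 2 : uri;

    /* The host ends at the first ':' or '/' (or at the end of the string). */
    size_t hostlen = strcspn(host, ":/");
    if (hostlen == 0 || hostlen >= cap) return -1;
    memcpy(hostname, host, hostlen);
    hostname[hostlen] = '\0';

    const char *rest = host + hostlen;
    *port = 80;                        /* default HTTP port */
    if (*rest == ':') {
        char *end;
        long p = strtol(rest + 1, &end, 10);
        if (end == rest + 1 || p <= 0 || p > 65535) return -1;
        *port = (int)p;
        rest = end;
    }
    /* An empty path means the root resource. */
    if (*rest == '\0') rest = "/";
    if (*rest != '/' || strlen(rest) >= cap) return -1;
    strcpy(path, rest);
    return 0;
}
```

For example, `parse_uri_const("http://localhost:8080/hub/index.html", host, path, &port, sizeof host)` fills in `localhost`, `8080`, and `/hub/index.html`, assuming `host` and `path` are buffers of the same size.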
- Test
$. /
*** Basic ***
Starting tiny on 24009
Starting proxy on 2533
basicScore: 40/40
*** Concurrency ***
concurrencyScore: 0/15
*** Cache ***
cacheScore: 0/15
totalScore: 40/70
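## Part II: Dealing with multiple concurrent requests
The original sequential version can handle only one request at a time. The goal of Part II is to add threads so that the proxy can handle multiple requests concurrently.

    while (1) {
        clientlen = sizeof(clientaddr);
        connfd = Accept(listenfd, (SA *)&clientaddr, &clientlen);
        /* pass the descriptor by value inside the pointer argument (intptr_t: <stdint.h>) */
        Pthread_create(&tid, NULL, thread, (void *)(intptr_t)connfd);
    }

### *thread

    void *thread(void *vargp) {
        int connfd = (int)(intptr_t)vargp;
        Pthread_detach(pthread_self()); /* reclaim thread resources automatically */
        doit(connfd);
        Close(connfd);
        return NULL;
    }
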
## Part III: Caching web objects
Add a caching mechanism: a simple main-memory cache that stores recently requested web content.
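
To make the idea of "recently requested" concrete, the following is a minimal, single-threaded sketch of choosing which cache slot to overwrite under an LRU policy. All names here (`lru_slot`, `lru_touch`, `lru_pick_victim`, `SLOTS`) are hypothetical and independent of the structures used in this write-up; a real proxy would also need locking around these operations.

```c
#include <assert.h>
#include <stddef.h>

#define SLOTS 4   /* hypothetical slot count for this sketch */

typedef struct {
    int in_use;              /* 0 means the slot is empty */
    unsigned long last_use;  /* logical timestamp of the latest access */
} lru_slot;

static unsigned long lru_clock = 0;  /* advances on every access */

/* Record an access to slot s. */
void lru_touch(lru_slot *s)
{
    assert(s != NULL);
    s->in_use = 1;
    s->last_use = ++lru_clock;
}

/* Return the index to overwrite: an empty slot if one exists,
 * otherwise the slot with the oldest timestamp. */
size_t lru_pick_victim(const lru_slot slots[], size_t n)
{
    assert(slots != NULL && n > 0);
    size_t victim = 0;
    for (size_t i = 0; i < n; i++) {
        if (!slots[i].in_use) return i;
        if (slots[i].last_use < slots[victim].last_use) victim = i;
    }
    return victim;
}
```

A logical counter is used instead of wall-clock time so that two accesses can never share a timestamp.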
### Cache structure

    typedef struct {
        char cache_obj[MAX_OBJECT_SIZE];
        char cache_url[MAXLINE];
        int LRU;
        int isEmpty;
        int readCnt;      /* count of readers */
        sem_t wmutex;     /* protects accesses to cache */
        sem_t rdcntmutex; /* protects accesses to readCnt */
        int writeCnt;
        sem_t wtcntMutex;
        sem_t queue;
    } cache_block;

    typedef struct {
        cache_block cacheobjs[CACHE_OBJS_COUNT]; /* ten cache blocks */
        int cache_num;
    } Cache;

    Cache cache; /* global cache instance (name assumed from its uses below) */

In doit, the cache is checked before contacting the end server:

    char url_store[MAXLINE];
    strcpy(url_store, uri); /* store the original url */
    /* is the uri cached? */
    int cache_index;
    if ((cache_index = cache_find(url_store)) != -1) {
        /* in cache: return the cached content */
        Rio_writen(connfd, cache.cacheobjs[cache_index].cache_obj,
                   strlen(cache.cacheobjs[cache_index].cache_obj));
        return;
    }

After the response has been relayed, it is stored if it fits:

    /* store it */
    if (sizebuf < MAX_OBJECT_SIZE) {
        cache_uri(url_store, cachebuf);
    }
    }

### Cache functions

    void cache_init() {
        cache.cache_num = 0;
        int i;
        for(i=0;i

    =CACHE_OBJS_COUNT) return -1; /* can not find url in the cache */
        return i;
    }

    /* find an empty cacheObj, or the cacheObj that should be evicted */
    int cache_eviction() {
        int minindex = 0;
        for(i=0; i


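This lab can be done after finishing lectures 21-25 of the CMU CS:APP course.
CS:APP course videos: Lab 6 download: choose PROXY LAB, then click SELF-STUDY HANDOUT.
The assignment has three parts (see the write-up for details):
- Sequential Proxy: receive the HTTP request sent by the client, parse it, forward it to the target server, then relay the response back to the client
- Concurrent Proxy: building on the first part, support multiple threads
- Cache Web Objects: use LRU to cache individual objects rather than whole pages

Step 1: read and understand the TINY server code (included in the handout). This gives a rough idea of how to write a server.
Step 2: follow write-up section 4, "Part I: Implementing a sequential web proxy".
Roughly the following programming work is needed. The proxy accepts a request, parses a request line of the form GET <URL> HTTP/1.1 and converts it to GET /hub/ HTTP/1.0, extracting the host and port along the way. The proxy then acts as a client itself and sends the HTTP/1.0 request to the target.
For the headers, forward everything unchanged at first, then set these four values:
Host:
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:10.0.3) Gecko/20120305 Firefox/10.0.3
Connection: close
Proxy-Connection: close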
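
Following the description above, the proxy has to recognize the four headers it overrides while copying the rest through. The sketch below shows one way to test a header line for that, under the assumption that header names should be matched case-insensitively and only at the start of the line. The function name `is_replaced_header` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>   /* strncasecmp (POSIX) */

/* Return 1 if the proxy should drop this client header line and send
 * its own value instead, 0 if the line should be forwarded unchanged. */
int is_replaced_header(const char *line)
{
    static const char *const names[] = {
        "Host:", "User-Agent:", "Connection:", "Proxy-Connection:"
    };
    assert(line != NULL);
    for (size_t i = 0; i < sizeof names / sizeof names[0]; i++) {
        /* Compare only the name prefix, ignoring case. */
        if (strncasecmp(line, names[i], strlen(names[i])) == 0) return 1;
    }
    return 0;
}
```

Matching on the prefix rather than searching the whole line avoids dropping an unrelated header whose value happens to contain one of these names.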
Step 3: implementation
3.1 Copy the TINY server skeleton and define some constants

    #include "csapp.h"

    /* Recommended max cache and object sizes */
    #define MAX_CACHE_SIZE 1049000
    #define MAX_OBJECT_SIZE 102400

    /* You won't lose style points for including this long line in your code */
    static const char *user_agent_hdr = "User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:10.0.3) Gecko/20120305 Firefox/10.0.3\r\n";
    static const char *conn_hdr = "Connection: close\r\n";
    static const char *prox_hdr = "Proxy-Connection: close\r\n";

    void doit(int fd);
    void clienterror(int fd, char *cause, char *errnum,
                     char *shortmsg, char *longmsg);
    void parse_uri(char *uri, char *hostname, char *path, int *port);
    void build_requesthdrs(rio_t *rp, char *newreq, char *hostname, char *port);
    void *thread(void *vargp);

    int main(int argc, char **argv)
    {
        int listenfd, *connfd;
        pthread_t tid;
        char hostname[MAXLINE], port[MAXLINE];
        socklen_t clientlen;
        struct sockaddr_storage clientaddr;

        /* Check command line args */
        if (argc != 2) {
            fprintf(stderr, "usage: %s <port>\n", argv[0]);
            exit(1);
        }

        listenfd = Open_listenfd(argv[1]);
        while (1) {
            printf("listening..\n");
            clientlen = sizeof(clientaddr);
            /* Heap-allocate the descriptor so each thread gets its own copy */
            connfd = Malloc(sizeof(int));
            *connfd = Accept(listenfd, (SA *)&clientaddr, &clientlen);
            Getnameinfo((SA *)&clientaddr, clientlen, hostname, MAXLINE,
                        port, MAXLINE, 0);
            printf("Accepted connection from (%s, %s)\n", hostname, port);
            Pthread_create(&tid, NULL, thread, connfd);
        }
    }

    /* Thread routine */
    void *thread(void *vargp)
    {
        int connfd = *((int *)vargp);
        Pthread_detach(pthread_self());
        Free(vargp);
        doit(connfd);
        Close(connfd);
        return NULL;
    }

    /*
     * doit - handle one HTTP request/response transaction
     */
    /* $begin doit */
    void doit(int client_fd)
    {
        int endserver_fd;
        char buf[MAXLINE], method[MAXLINE], uri[MAXLINE], version[MAXLINE];
        rio_t from_client, to_endserver;
        /* store the request line arguments */
        char hostname[MAXLINE], path[MAXLINE]; /* path, e.g. /hub/ */
        int port;

        /* Read request line and headers */
        Rio_readinitb(&from_client, client_fd);
        if (!Rio_readlineb(&from_client, buf, MAXLINE))
            return;
        sscanf(buf, "%s %s %s", method, uri, version);
        if (strcasecmp(method, "GET")) {
            clienterror(client_fd, method, "501", "Not Implemented",
                        "Proxy Server does not implement this method");
            return;
        }

        /* parse uri then open a clientfd */
        parse_uri(uri, hostname, path, &port);
        char port_str[10];
        sprintf(port_str, "%d", port);
        endserver_fd = Open_clientfd(hostname, port_str);
        if (endserver_fd < 0) {
            printf("connection failed\n");
            return;
        }
        Rio_readinitb(&to_endserver, endserver_fd);

        char newreq[MAXLINE]; /* request headers for the end server */
        /* set up the first line, e.g. GET /hub/ HTTP/1.0 */
        sprintf(newreq, "GET %s HTTP/1.0\r\n", path);
        build_requesthdrs(&from_client, newreq, hostname, port_str);

        Rio_writen(endserver_fd, newreq, strlen(newreq)); /* send the request to the end server */
        int n;
        while ((n = Rio_readlineb(&to_endserver, buf, MAXLINE))) { /* end server response into buf */
            Rio_writen(client_fd, buf, n); /* relay the response to the client */
        }
        Close(endserver_fd);
    }
    /* $end doit */

    /*
     * clienterror - returns an error message to the client
     */
    /* $begin clienterror */
    void clienterror(int fd, char *cause, char *errnum,
                     char *shortmsg, char *longmsg)
    {
        char buf[MAXLINE], body[MAXBUF];

        /* Build the HTTP response body in one call; sprintf(body, "%s...", body) would overlap */
        snprintf(body, sizeof(body),
                 "<html><title>Proxy Error</title>"
                 "<body bgcolor=\"ffffff\">\r\n"
                 "%s: %s\r\n"
                 "<p>%s: %s\r\n"
                 "<hr><em>The Proxy Web server</em>\r\n",
                 errnum, shortmsg, longmsg, cause);

        /* Print the HTTP response */
        sprintf(buf, "HTTP/1.0 %s %s\r\n", errnum, shortmsg);
        Rio_writen(fd, buf, strlen(buf));
        sprintf(buf, "Content-type: text/html\r\n");
        Rio_writen(fd, buf, strlen(buf));
        sprintf(buf, "Content-length: %d\r\n\r\n", (int)strlen(body));
        Rio_writen(fd, buf, strlen(buf));
        Rio_writen(fd, body, strlen(body));
    }
    /* $end clienterror */

3.2 Implement two helper functions
Before writing parse_uri, it helps to review how C's string (str*) functions are used.

    void parse_uri(char *uri, char *hostname, char *path, int *port) {
        *port = 80;
        /* skip the scheme, if any */
        char *pos1 = strstr(uri, "//");
        if (pos1 == NULL) {
            pos1 = uri;
        } else {
            pos1 += 2;
        }

        char *pos2 = strstr(pos1, ":");
        if (pos2 != NULL) {
            /* pos1 -> host, pos2 -> :8080/hub/ */
            *pos2 = '\0';
            strncpy(hostname, pos1, MAXLINE);
            sscanf(pos2 + 1, "%d%s", port, path); /* pos2+1 -> 8080/hub/ */
            *pos2 = ':';
        } else {
            pos2 = strstr(pos1, "/"); /* pos2 -> /hub/ */
            if (pos2 == NULL) {
                /* host only: request the root resource */
                strncpy(hostname, pos1, MAXLINE);
                strcpy(path, "/");
                return;
            }
            *pos2 = '\0';
            strncpy(hostname, pos1, MAXLINE);
            *pos2 = '/';
            strncpy(path, pos2, MAXLINE);
        }
    }

    void build_requesthdrs(rio_t *rp, char *newreq, char *hostname, char *port) {
        /* newreq already holds the request line: "GET <path> HTTP/1.0\r\n" */
        char buf[MAXLINE];

        while (Rio_readlineb(rp, buf, MAXLINE) > 0) {
            if (!strcmp(buf, "\r\n")) break;
            /* drop the headers that the proxy replaces with its own values */
            if (strstr(buf, "Host:") != NULL) continue;
            if (strstr(buf, "User-Agent:") != NULL) continue;
            if (strstr(buf, "Connection:") != NULL) continue;
            if (strstr(buf, "Proxy-Connection:") != NULL) continue;
            strcat(newreq, buf); /* sprintf(newreq, "%s...", newreq) would overlap its own input */
        }
        sprintf(newreq + strlen(newreq), "Host: %s:%s\r\n", hostname, port);
        strcat(newreq, user_agent_hdr);
        strcat(newreq, conn_hdr);
        strcat(newreq, prox_hdr);
        strcat(newreq, "\r\n");
    }

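3.3 Testing
Next, following the approach in the lecture slides, implement the multithreaded version.
Part 2, continued
Then, following this blog post and a newer set of hints, I optimized my code.
Part 2.1, continued: modify csapp.c for error protection
On any error, always return 0.
Part 2.2, continued: test for file descriptor leaks
I decided to use arrays for the cache. To avoid wasting space, I use tiered arrays (size classes), much like Malloc Lab.
The largest cacheable object is 100 KB, and there is 1 MB of cache space in total.
- 100 KB blocks: 5 (500 KB)
- 25 KB blocks: 12 (300 KB)
- 10 KB blocks: 10 (100 KB)
- 5 KB blocks: 20 (100 KB)
- 1 KB blocks: 20 (20 KB)
- 100 B blocks: 40 (4 KB)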
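
The size-class layout above can be checked with a short sketch. The two arrays use the same values as `cache_block_size` and `cache_cnt` in cache.c; the function names `size_class_for` and `total_cache_bytes` are hypothetical helpers introduced only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

#define TYPES 6
static const int block_size[TYPES]  = {102, 1024, 5120, 10240, 25600, 102400};
static const int block_count[TYPES] = {40, 20, 20, 10, 12, 5};

/* Smallest class whose block can hold len bytes, or -1 if len is too large. */
int size_class_for(int len)
{
    assert(len >= 0);
    for (int i = 0; i < TYPES; i++)
        if (len <= block_size[i]) return i;
    return -1;
}

/* Total bytes reserved for object data across all classes. */
long total_cache_bytes(void)
{
    long total = 0;
    for (int i = 0; i < TYPES; i++)
        total += (long)block_size[i] * block_count[i];
    return total;
}
```

With these values the total comes to 1,048,560 bytes, which stays under the recommended `MAX_CACHE_SIZE` of 1,049,000. Returning -1 for oversized objects gives the caller an explicit signal to skip caching.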
Step 1: define the data structures

    //cache.h
    #include "csapp.h"
    #include <stdint.h>

    #define TYPES 6
    extern const int cache_block_size[];
    extern const int cache_cnt[];

    typedef struct cache_block {
        char *url;
        char *data;
        int datasize;
        int64_t time;            /* last access time in ms; 0 means empty */
        pthread_rwlock_t rwlock;
    } cache_block;

    typedef struct cache_type {
        cache_block *cacheobjs;
        int size;
    } cache_type;

    extern cache_type caches[TYPES]; /* defined in cache.c */

    //initialize cache with malloc
    void init_cache();
    //on a cache miss return 0; on a hit write the content to fd
    int read_cache(char *url, int fd);
    //save value to cache
    void write_cache(char *url, char *data, int len);
    //free cache
    void free_cache();

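Step 2: implement the functions
Here we use the readers-writers pattern and, as the hint says, do not follow LRU strictly. What does that mean? It implies that a read needs to update the timestamp. If another thread is updating the same cache block at the same time, that thread's update wins, and a failed try-lock does not need to be retried.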
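
    //cache.c
    #include "cache.h"

    const int cache_block_size[] = {102, 1024, 5120, 10240, 25600, 102400};
    const int cache_cnt[] = {40, 20, 20, 10, 12, 5};
    cache_type caches[TYPES];
    int64_t currentTimeMillis();

    void init_cache()
    {
        int i = 0;
        for (; i < TYPES; i++) {
            caches[i].size = cache_cnt[i];
            caches[i].cacheobjs = (cache_block *)malloc(cache_cnt[i] * sizeof(cache_block));
            cache_block *j = caches[i].cacheobjs;
            int k;
            for (k = 0; k < cache_cnt[i]; j++, k++) {
                j->time = 0;
                j->datasize = 0;
                j->url = malloc(sizeof(char) * MAXLINE);
                strcpy(j->url, "");
                j->data = malloc(sizeof(char) * cache_block_size[i]);
                memset(j->data, 0, cache_block_size[i]);
                pthread_rwlock_init(&j->rwlock, NULL);
            }
        }
    }

    void free_cache() {
        int i = 0;
        for (; i < TYPES; i++) {
            cache_block *j = caches[i].cacheobjs;
            int k;
            for (k = 0; k < cache_cnt[i]; j++, k++) {
                free(j->url);
                free(j->data);
                pthread_rwlock_destroy(&j->rwlock);
            }
            free(caches[i].cacheobjs);
        }
    }

    int read_cache(char *url, int fd) {
        int tar = 0, i = 0;
        cache_type cur;
        cache_block *p;
        printf("read cache %s \n", url);
        /* search every size class for a used block with a matching url */
        for (; tar < TYPES; tar++) {
            cur = caches[tar];
            p = cur.cacheobjs;
            for (i = 0; i < cur.size; i++, p++) {
                if (p->time != 0 && strcmp(url, p->url) == 0) break;
            }
            if (i < cur.size) break;
        }
        if (i == cur.size) {
            printf("read cache fail\n");
            return 0;
        }
        /* re-check under the lock: the block may have been replaced meanwhile */
        pthread_rwlock_rdlock(&p->rwlock);
        if (strcmp(url, p->url) != 0) {
            pthread_rwlock_unlock(&p->rwlock);
            return 0;
        }
        pthread_rwlock_unlock(&p->rwlock);
        /* best-effort timestamp update; skip it if another thread holds the lock */
        if (!pthread_rwlock_trywrlock(&p->rwlock)) {
            p->time = currentTimeMillis();
            pthread_rwlock_unlock(&p->rwlock);
        }
        pthread_rwlock_rdlock(&p->rwlock);
        Rio_writen(fd, p->data, p->datasize);
        pthread_rwlock_unlock(&p->rwlock);
        printf("read cache successful\n");
        return 1;
    }

    void write_cache(char *url, char *data, int len) {
        int tar = 0;
        /* pick the smallest size class that fits len */
        for (; tar < TYPES && len > cache_block_size[tar]; tar++);
        if (tar == TYPES) return; /* larger than the biggest block: do not cache */
        printf("write cache %s %d\n", url, tar);
        /* find empty block */
        cache_type cur = caches[tar];
        cache_block *p = cur.cacheobjs, *pt;
        int i;
        for (i = 0; i < cur.size; i++, p++) {
            if (p->time == 0) {
                break;
            }
        }
        /* find last visited */
        int64_t min = currentTimeMillis();
        if (i == cur.size) {
            for (i = 0, pt = cur.cacheobjs; i < cur.size; i++, pt++) {
                if (pt->time <= min) {
                    min = pt->time;
                    p = pt;
                }
            }
        }
        pthread_rwlock_wrlock(&p->rwlock);
        p->time = currentTimeMillis();
        p->datasize = len;
        strncpy(p->url, url, MAXLINE - 1); /* url may be shorter than MAXLINE */
        p->url[MAXLINE - 1] = '\0';
        memcpy(p->data, data, len);
        pthread_rwlock_unlock(&p->rwlock);
        printf("write Cache\n");
    }

    int64_t currentTimeMillis() {
        struct timeval time;
        gettimeofday(&time, NULL);
        int64_t s1 = (int64_t)(time.tv_sec) * 1000;
        int64_t s2 = (time.tv_usec / 1000);
        return s1 + s2;
    }
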
Step 3: integrate into the existing code
3.1 Modify the Makefile
Step 4: testing

    valgrind --leak-check=full --show-leak-kinds=all ./proxy 45161
