Trouble getting the Request Body of an HTTP POST

Hi there,

I recently created a custom Nginx module, and successfully compiled it
into the Nginx source and used a location directive to direct traffic to
my module (good job me).

In my module's handler, I of course have access to the ngx_http_request_t
*r object passed into the function.

Using this object, I have no trouble:

  1. Getting the request headers (r->headers_in)
  2. Setting and sending the response headers (r->headers_out,
    ngx_http_send_header(r);)
  3. Sending the response body (return ngx_http_output_filter(r, &out);)

However, I’m having a very difficult time accessing the request body.
Let me explain:

  1. I run nginx
  2. I telnet to my localhost at port 80
  3. I enter something like this:

POST /mylocation HTTP/1.0
Host: 127.0.0.1
Content-Length: 13
value=Tronman

But in my handler function, r->request_body is completely NULL!

However, I know the body is being read since if I look at
“r->request_line.data” and simply offset the pointer by the length of
the headers, I clearly see the “value=Tronman”. But I don’t feel this is
a very safe thing to do unless I can always be sure that the Request
Body will indeed be accessible this way and I know the total length of
the headers. I’m not making any calls to
ngx_http_discard_request_body(r); in the handler or anywhere else I’m
aware of.

I also tried using ngx_http_read_client_request_body(), but this
ultimately just makes r->request_body->buf point to gibberish. I used
gdb to trace through the code and it ultimately leads to a failed “recv”
socket call, a failure that would make sense if the body had
already been read (which it seems to have been).

A few other details:
System: Ubuntu 9.10 32-bit
Nginx Version: 0.7.64

So ultimately it comes down to:

  1. Why isn’t the response body in r->response_body like I would expect?
  2. Would it be safe to access it by looking past request_line.data, and
    if so, how do I know the total header length?
  3. Is there another method I should be using to access the response
    body?
  4. Any other advice you might have would be welcome!

Thanks.

Posted at Nginx Forum:

Hello!

On Thu, Dec 17, 2009 at 11:35:16AM -0500, Tronman wrote:

> I recently created a custom Nginx module, and successfully compiled it
> into the Nginx source and used a location directive to direct traffic
> to my module (good job me).
>
> In my module's handler, I of course have access to the
> ngx_http_request_t *r object passed into the function.
>
> […]
>
> So ultimately it comes down to:
>
>   1. Why isn’t the response body in r->response_body like I would expect?

Your expectation is wrong. Handler functions are called before
the request body has been read from the client.

If you expect a request body and want to read it, you should use
ngx_http_read_client_request_body().

>   2. Would it be safe to access it by looking past request_line.data,
>      and if so, how do I know the total header length?

No.

>   3. Is there another method I should be using to access the response body?

You mean request body? Use ngx_http_read_client_request_body().

>   4. Any other advice you might have would be welcome!

To see the difference between (2) and (3) you may consider using
some big POST (e.g. 10M) slowly uploaded by client. This also
should give you some idea about how to handle multiple request
body buffers and disk-buffered data.

Maxim D.

Hi there, Thanks for your response!

I’m still trying to get the hang of the I/O model going on here. Bear
with me to see if I understand:

When a connection comes in, the “ngx_http_my_module_handler” is called.
This function has the request headers, but not the request body. In
“ngx_http_my_module_handler”, we can use
“ngx_http_read_client_request_body(r, ngx_http_my_module_body_handler);”
to set a callback (handler) to be run when the Request Body is read
(yes, I did mean Request Body, sorry for my typo). Since the response is
dependent on the body, “ngx_http_my_module_handler” really doesn’t have
to do anything else after setting the callback but return NGX_DONE.

Thus, when the function “ngx_http_my_module_body_handler” gets called,
I’ve got both the original headers, and the request body, and it’s here
that I should place all of the logic to build my response and send it to
the client.

This implies that the client program connecting to the server must use
separate “send” calls to send the header first, then the body, and
cannot put both a header and a body into the same buffer and send them
at the same time. Is this the case?

> To see the difference between (2) and (3) you may consider using
> some big POST (e.g. 10M) slowly uploaded by client. This also
> should give you some idea about how to handle multiple request
> body buffers and disk-buffered data.

In this case, what would be the order of function calls? For example,
would “ngx_http_my_module_body_handler” be called for each piece of the
body (which I would write my own logic to handle), or called once after
the entire (say 10M) body has been read? Is “ngx_http_my_module_handler”
still called only once, when the headers arrive?

In the meantime, I have moved my response generating logic into the
“ngx_http_my_module_body_handler” which is working much better (for
single chunk bodies at least). Also, when I was telnet-ing, I was trying
to send headers and body in one chunk, which wasn’t working. Instead I
started sending them in separate chunks and am getting closer to the
behaviour I expected.

Thanks for your help!


Hello!

On Thu, Dec 17, 2009 at 01:58:00PM -0500, Tronman wrote:

> to be run when the Request Body is read (yes, I did mean Request
> Body, sorry for my typo). Since the response is dependent on the
> body, “ngx_http_my_module_handler” really doesn’t have to do
> anything else after setting the callback but return NGX_DONE.

Yes.

But please note that once ngx_http_read_client_request_body()
returns something >= NGX_HTTP_SPECIAL_RESPONSE you should return
it from handler instead of NGX_DONE. Basically it means that your
body handler was already called (since full body was already
read from client) and you have to pass its return code. So code
in handler should look like:

rc = ngx_http_read_client_request_body(r,
                            ngx_http_my_module_body_handler);

if (rc >= NGX_HTTP_SPECIAL_RESPONSE) {
    return rc;
}

return NGX_DONE;

See e.g. proxy module for an example.

> Thus, when the function “ngx_http_my_module_body_handler” gets
> called, I’ve got both the original headers, and the request
> body, and it’s here that I should place all of the logic to
> build my response and send it to the client.

Yes.

> This implies that the client program connecting to the server
> must use separate “send” calls to send the header first, then
> the body, and cannot put both a header and a body into the same
> buffer and send them at the same time. Is this the case?

No. As long as the HTTP request is correct, it doesn’t matter whether
it was sent in one packet or not.
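
For reference, a "correct" request here means that the headers end with an empty line (CRLF CRLF) before the body starts, and that Content-Length matches the number of body bytes. The following is only a minimal sketch of what a client would put on the wire; build_post_request() is a hypothetical helper invented for this illustration, not part of nginx or any client library.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format a minimal HTTP/1.0 POST into `out`.
 * Returns the request length, or -1 if `out` is too small.
 * build_post_request() is a hypothetical helper used only for this sketch.
 */
int
build_post_request(char *out, size_t cap, const char *path,
                   const char *host, const char *body)
{
    assert(out != NULL && path != NULL && host != NULL && body != NULL);

    int n = snprintf(out, cap,
                     "POST %s HTTP/1.0\r\n"
                     "Host: %s\r\n"
                     "Content-Length: %zu\r\n"
                     "\r\n"              /* empty line terminates the headers */
                     "%s",
                     path, host, strlen(body), body);

    if (n < 0 || (size_t) n >= cap) {
        return -1;                       /* formatting error or truncated */
    }

    return n;
}
```

Whether the resulting bytes reach the server in a single send() or in several makes no difference to how the request is parsed; only the byte sequence matters.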

> > To see the difference between (2) and (3) you may consider
> > using some big POST (e.g. 10M) slowly uploaded by client. This
> > also should give you some idea about how to handle multiple
> > request body buffers and disk-buffered data.

> In this case, what would be the order of function calls? For
> example, would “ngx_http_my_module_body_handler” be called for
> each piece of the body (which I would write my own logic to
> handle) or called after the entire (say 10M) body has been read.

It’s called once entire body has been read.

> Is “ngx_http_my_module_handler” still called only once when the header arrives?

Yes.

> In the meantime, I have moved my response generating logic into
> the “ngx_http_my_module_body_handler” which is working much
> better (for single chunk bodies at least). Also, when I was
> telnet-ing, I was trying to send headers and body in one chunk,
> which wasn’t working. Instead I started sending them in separate
> chunks and am getting closer to the behaviour I expected.

Most likely you did something wrong, either while telnetting or in
your module.

Maxim D.

curl is a good way to test URLs without having to re-type everything
each time. You can POST by using the -d or -F option. Much better
than telnet for this type of testing, in my experience.

> This implies that the client program connecting to the server must use
> separate “send” calls to send the header first, then the body, and
> cannot put both a header and a body into the same buffer and send them
> at the same time. Is this the case?

Ah, I’m wrong on this. It’s just the screwy way Telnet was sending the
data chunks.


> Most likely you did something wrong, either while telnetting or in
> your module.

Precisely, it was a mistake I was making with my telnet session. I
misinterpreted “sending” as really just inserting a couple of new lines.
Also, it worked properly when I wrote a simple form to POST in Firefox.

Great! Your response cleared up nearly everything for me. The only other
thing I’m not completely clear on is the difference between:

ngx_chain_t *bufs;

and

ngx_buf_t *buf;

in ngx_http_request_body_t. What is the difference between them? When
should I use either to access the body?

I seem to be able to access:

r->request_body->buf->start and
r->request_body->buf->pos

the same as I can use:

r->request_body->bufs->buf->start and
r->request_body->bufs->buf->pos

While r->request_body->bufs->next is NULL. So obviously the chain is
some sort of linked list.

My only clues here are the:

client_body_in_single_buffer and
client_body_buffer_size

directives in NginxHttpCoreModule

So if I specify “client_body_in_single_buffer” as “true”, does it store
the entire client request in r->request_body->buf?
But if not, does it separate the client request body into the chunks of
whatever client_body_buffer_size is set to, and then you use the chain
to gain access to the entire thing?

Thanks again!


Hello!

On Thu, Dec 17, 2009 at 03:33:04PM -0500, Tronman wrote:

[…]

> in ngx_http_request_body_t. What is the difference between them?
> When should I use either to access the body?

You should use ->bufs. Here is the comment from
ngx_http_request_body.c which should be helpful:

/*
 * on completion ngx_http_read_client_request_body() adds to
 * r->request_body->bufs one or two bufs:
 *    *) one memory buf that was preread in r->header_in;
 *    *) one memory or file buf that contains the rest of the body
 */

Simple example of using request_body may be found in
ngx_http_variables.c, in function
ngx_http_variable_request_body().

[…]

> So if I specify “client_body_in_single_buffer” as “true”, does
> it store the entire client request in r->request_body->buf?

No. It will be in the first buffer of r->request_body->bufs, either
in memory if it fits into client_body_buffer_size or in a file.

> But if not, does it separate the client request body into the
> chunks of whatever client_body_buffer_size is set to, and then
> you use the chain to gain access to the entire thing?

Not exactly. Currently r->request_body->bufs will contain up to two
buffers: one for data preread with headers, and one for data got
later. If there is more data than client_body_buffer_size, the
last buffer will be in a temporary file.
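
As a rough illustration of walking such a chain, the sketch below copies all in-memory buffers into one contiguous area and gives up if any buffer is file-backed. It is a standalone model, not nginx code: sketch_buf_t and sketch_chain_t are simplified stand-ins for ngx_buf_t and ngx_chain_t that keep only the pos, last, in_file, buf and next fields, and sketch_copy_body() is a hypothetical helper. A real module would use the nginx types and pool allocation instead.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins for ngx_buf_t and ngx_chain_t (assumption: only
 * the fields needed for this sketch are modelled). */
typedef struct {
    unsigned char *pos;      /* start of the data held in memory */
    unsigned char *last;     /* one past the end of that data */
    int            in_file;  /* nonzero if the data lives in a temp file */
} sketch_buf_t;

typedef struct sketch_chain_s {
    sketch_buf_t          *buf;
    struct sketch_chain_s *next;
} sketch_chain_t;

/*
 * Copy every in-memory buffer of the chain into `dst`, in order.
 * Returns the number of bytes copied, or -1 if a buffer is file-backed
 * or `dst` is too small.
 */
long
sketch_copy_body(const sketch_chain_t *bufs, unsigned char *dst, size_t cap)
{
    size_t total = 0;

    assert(dst != NULL);

    for (const sketch_chain_t *cl = bufs; cl != NULL; cl = cl->next) {
        if (cl->buf->in_file) {
            return -1;       /* body went to disk; it must be read from the file */
        }

        size_t len = (size_t) (cl->buf->last - cl->buf->pos);

        if (len > cap - total) {
            return -1;       /* destination too small */
        }

        memcpy(dst + total, cl->buf->pos, len);
        total += len;
    }

    return (long) total;
}
```

The point of the loop is that the body handler should not assume a single buffer: part of the body may have arrived together with the headers, and the rest may be in a second buffer or in a temporary file.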

It’s probably a good idea to spend some time reading
ngx_http_request_body.c for a better idea of what happens in various
cases.

Maxim D.

Nick P. Wrote:

> curl is a good way to test URLs without having to re-type everything
> each time. You can POST by using the -d or -F option. Much better
> than telnet for this type of testing, in my experience.

Yes, I’ve heard of cURL but hadn’t used it a whole lot. I’ve installed
it and will give it a try, thanks!

Maxim D. Wrote:

> rest of the body
> */
>
> Simple example of using request_body may be found in
> ngx_http_variables.c, in function ngx_http_variable_request_body().

Ah, I was looking for something like that. Thanks!

> one for data got later. If there is more data than
> client_body_buffer_size, the last buffer will be in a temporary file.
>
> It’s probably a good idea to spend some time reading
> ngx_http_request_body.c for a better idea of what happens in various
> cases.
>
> Maxim D.

Gotcha. I think I’m getting a handle on what’s going on. Thanks again
for all your help!


Great discussion even though this was posted a year ago! I was looking
for information on capturing the POST request body and this cleared
things up quite a bit on what is going on under the hood. My problem is
slightly different in that I just want to log the request body.

I have my config set up to handle a bunch of GET requests which render
pixels that work fine to handle analytics and parse query strings for
logging. With an additional third-party data stream, I need to handle a
POST request to a given URL that has JSON in an expected loggable format
inside of its request body. I don’t want to use a secondary server with
proxy_pass and just want to log the whole response into an associated
log file like what it does with GET requests. A snippet of some code
that I’m using looks like the following:

GET request (which works great):

location ^~ /rl.gif {
    set $rl_lcid $arg_lcid;
    if ($http_cookie ~* "lcid=(.*\S)") {
        set $rl_lcid $cookie_lcid;
    }

    empty_gif;
    log_format my_tracking '{ "guid" : "$rl_lcid", "data" : "$arg__rlcdnsegs" }';
    access_log  /mnt/logs/nginx/my.access.log my_tracking;
    rewrite ^(.*)$ http://my/url?id=$cookie_lcid? redirect;
}

Here is kinda what I am trying to do:
POST request (which does not work):

location /bk {
    log_format bk_tracking $request_body;
    access_log  /mnt/logs/nginx/bk.access.log bk_tracking;
}

Running “curl http://myurl/bk -d name=example” gives me a 404 Not
Found.

Then I tried:

location /bk.gif {
    empty_gif;
    log_format bk_tracking $request_body;
    access_log  /mnt/logs/nginx/bk.access.log bk_tracking;
}

Running “curl http://myurl/bk.gif -d name=example” gives me a 405 Not
Allowed.

My current version is nginx/0.7.62. Any help in the right direction is
very much appreciated! Thanks!
