Can Anyone Explain This Memory Leak?

Hi Folks,

Sorry to get your attention. :slight_smile:

There’s a very strange problem with Mongrel where, if Threads pile up waiting
on the Mutex around Rails dispatching, a lot of ram gets allocated that never
seems to be released.

I boiled the problem down to this:

http://pastie.caboo.se/10194

It’s a graph of the “leak” and the base code that causes it (nothing
Mongrel in it at all). This code kind of simulates how Mongrel is
managing threads and locking Rails.

What this code does is create threads until there are 1000 of them in a
ThreadGroup waiting on a Mutex. Inside the guard, 30,000 integers are put
into an Array. Don’t let that distract you: it can be strings, or even
nothing, and you’ll see the same thing. It’s just there to simulate
Rails creating all the stuff it creates, and to demonstrate that while
these objects should go away, they do not.

Then it waits in 10 second increments for these threads to go away,
calling GC.start each time.
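
In case the pastie goes away, here’s roughly the shape of the script (not the
exact code, just the same structure, with the counts described above):

  require 'thread'   # Mutex lives here in 1.8

  guard = Mutex.new
  group = ThreadGroup.new

  3.times do |cycle|
    # create threads until 1000 of them are sitting in the ThreadGroup
    until group.list.size >= 1000
      t = Thread.new do
        guard.synchronize do
          Array.new(30_000) { |i| i }   # stand-in for Rails allocating stuff
        end
      end
      group.add(t)
    end

    # wait in 10 second increments for the threads to go away, GC'ing each time
    until group.list.empty?
      sleep 10
      GC.start
      puts "cycle #{cycle}: #{group.list.size} threads left"
    end
  end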

And what happens is the graph you see (memory usage of the ruby process
sampled once per second over 3 cycles of creating and destroying threads).
Rather than the memory for the threads and the arrays of integers going away,
it sticks around. It dips a little, but not much; it just tops out and stays
there, even though all the threads are clearly gone and none of their
contents should still be around.

In contrast, if you remove the Mutex then the ram behaves as you’d
expect, with it going up and then going away.

I’m hoping people way smarter with Ruby than myself can tell me why this
happens, what is wrong with this code, and how to fix it.

Thanks.

Have you tried running it with the latest 1.8.5 prerelease?
From the changelog:

Thu Dec 29 23:59:37 2005  Nobuyoshi N.  [email protected]

    * eval.c (rb_gc_mark_threads): leave unmarked threads which won't
      wake up alone, and mark threads in the loading table.
      [ruby-dev:28154]

    * eval.c (rb_gc_abort_threads), gc.c (gc_sweep): kill unmarked
      threads.  [ruby-dev:28172]

On Fri, 25 Aug 2006, Zed S. wrote:

these objects should go away, they do not.

In contrast, if you remove the Mutex then the ram behaves as you’d
expect, with it going up and then going away.

I’m hoping people way smarter with Ruby than myself can tell me why this
happens, what is wrong with this code, and how to fix it.

Thanks.

hi zed-

i don’t think you have a leak. try running under electric fence (ef). when i
do i clearly see the memory rise from 1->20% on my desktop, and then decline
back down to 1%, over and over with no reported leaks. the cycle matches the
logging of the script perfectly.

here’s the thing though, when i don’t run it under electric fence i see the
memory climb to about 20% and then stay there forever. but this too does not
indicate a leak. it just shows how calling ‘free’ in a process doesn’t really
release memory to the os, only to the process itself. the reason you see the
memory vary nicely under ef is that it replaces the standard malloc/free with
its own voodoo - details of which i do not understand or care to. the point,
however, is that it’s ‘free’ which is doing the ‘leaking’ - just at the os
level, not the process (ruby) level. we have tons of really long running
processes that exhibit the exact same behaviour - basically the memory image
will climb to maximum and stay there. oddly, however, when you tally them all
up the usage exceeds the system capacity plus swap by miles.

i think this is just illustrating reason 42 why i prefer Kernel.fork to
Thread.new - the only real way to return memory to the os is to exit!
:wink:
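
e.g. something like this (rough sketch only, hypothetical worker body):

  # rough sketch (illustrative only): do the memory-hungry work in a child
  # process so the memory really is returned to the os when it exits.
  def run_in_child
    pid = fork do
      Array.new(30_000) { |i| i }   # hypothetical stand-in for the real work
      exit! 0                       # child exits; the os reclaims everything
    end
    Process.wait(pid)
  end

  10.times { run_in_child }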

regards.

-a

On Fri, 2006-08-25 at 14:33 +0900, [email protected] wrote:

i think this is just illustraing reason 42 why i prefer Kernel.fork to
Thread.new - the only real way to return memory to the os is to exit! :wink:

And why Apache with pre-fork is good for long-running applications :slight_smile:

Greetings,
JS

[email protected] wrote:

it just shows how calling ‘free’ in a process doesn’t really release memory
to the os, only to the process itself. we have tons of really long running
processes that exhibit the exact same behaviour - basically the memory image
will climb to maximum and stay there. oddly, however, when you tally them all
up the usage exceeds the system capacity plus swap by miles.

So, to test this hypothesis, the OP could try to instantiate a large
number of objects, and see if there is no effect on the vmsize reported
by the OS, right? Because those objects should be able to use the memory
that is owned by the process, but not used by ruby objects.
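
Something like this would do it (a rough sketch; the VmSize parsing assumes
Linux’s /proc, and the sizes are arbitrary):

  # Rough sketch: if freed memory is retained by the process rather than
  # returned to the OS, the second allocation should be able to reuse it and
  # the reported vmsize should not grow much.
  def vmsize_kb
    # Linux-specific: read the virtual memory size from /proc
    File.read("/proc/#{Process.pid}/status")[/^VmSize:\s+(\d+)/, 1].to_i
  end

  puts "baseline:           #{vmsize_kb} kB"

  a = Array.new(500_000) { |i| i }   # first big allocation
  puts "after first alloc:  #{vmsize_kb} kB"

  a = nil
  GC.start
  puts "after GC:           #{vmsize_kb} kB"

  b = Array.new(500_000) { |i| i }   # should fit in memory the process owns
  puts "after second alloc: #{vmsize_kb} kB"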

On Fri, 2006-08-25 at 14:33 +0900, [email protected] wrote:

here’s the thing though, when i don’t run it under electric fence i see the
memory climb to about 20% and then stay there forever. but this too does not
indicate a leak.

Nope, I can’t agree with this because the ram goes up, the OS will kill
it eventually, and if I remove the guard the ram doesn’t do this.

And where are you getting your information that free doesn’t free
memory? I’d like to read that since all my years of C coding says that
is dead wrong. Care to tell me how malloc/free would report 80M with
Mutex but properly show the ram go down when there is no-Mutex?

And why would Linux kill processes if the ram gets too high? Why do whole
VPS servers crash? I mean if this ram was just “fake” reporting (which
is very hard to believe) then why are all these things happening?

So please, point me at where in the specifications for malloc/free on
Linux it says that the memory reported will be high even though free and
malloc is called on 80M of ram slowly cycled out, and that linux will
still kill your process even though this ram is not really owned by the
process.

On Fri, 25 Aug 2006, Srinivas JONNALAGADDA wrote:

On Fri, 2006-08-25 at 14:33 +0900, [email protected] wrote:

i think this is just illustraing reason 42 why i prefer Kernel.fork to
Thread.new - the only real way to return memory to the os is to exit! :wink:

And why Apache with pre-fork is good for long-running applications :slight_smile:

amen!

fork + drb takes it every time :wink:

-a

On Fri, 2006-08-25 at 14:46 +0900, Joel VanderWerf wrote:

So, to test this hypothesis, the OP could try to instantiate a large
number of objects, and see if there is no effect on the vmsize reported
by the OS, right? Because those objects should be able to use the memory
that is owned by the process, but not used by ruby objects.

This is actually what I refer to in the no-Mutex situation. Create a
ton of threads without a mutex in them and the ram goes away. The
evidence doesn’t support the claims at all.

Also the fact that the OS is killing these processes and swap is getting
used indicates that this is real memory being lost.

But, a reply from Kent S. says this could be a bug in 1.8.4. So
there’s even more evidence that it is a leak.

On Fri, 2006-08-25 at 14:41 +0900, Kent S. wrote:

You gotta be kidding me. A damn bug? Oh no, according to ara.t.howard
it’s because free doesn’t actually free.

Man, two days wasted for nothing.

I’ll try 1.8.5 tomorrow. I’m kind of tired of this to be honest.

On Fri, 25 Aug 2006, Zed S. wrote:

And where are you getting your information that free doesn’t free
memory? I’d like to read that since all my years of C coding says that
is dead wrong. Care to tell me how malloc/free would report 80M with
Mutex but properly show the ram go down when there is no-Mutex?

a nice explanation:

“A couple of things: Some `malloc' implementations use mmap() to allocate
large blocks (sometimes the threshold is a page or two, sometimes more), so
this might be part of what you're seeing. Some programs have allocation
patterns that interact badly with the way certain allocators work. Often, for
example, when some number of objects of a certain size have been allocated, a
future allocation cuts up a page into chunks of that size, gives you one, and
throws the rest onto a free list. If the allocation pattern of a program is to
allocate many chunks of a certain size, free them, and then allocate many
chunks of a somewhat larger size, the allocator can't satisfy the latter
requests (as bunches of somewhat-too-small chunks are on the free lists)
without grabbing more address space from the OS (via sbrk()). This is not
necessarily a bad thing, even though it makes it look like the overall size of
the program is expanding; though the address space may have grown, the pages
containing the `somewhat-too-small' chunks that have been freed are eventually
swapped out; unless they're touched again their only downside is consumption
of swap space. It's really only a problem for programs with very large
footprints; even in cases like that, at some point most malloc implementations
will `unslice' space from previously sliced pages.”

http://groups.google.com/group/comp.unix.programmer/browse_frm/thread/23e7be26dd21434a/2f84b3dc080c7519?lnk=gst&q=memory+not+really+freed&rnum=9#2f84b3dc080c7519

the paper referenced is also good.

ftp://ftp.cs.utexas.edu/pub/garbage/allocsrv.ps

regards.

-a

On Fri, 25 Aug 2006, Zed S. wrote:

Nope, I can’t agree with this because the ram goes up, the OS will kill it
eventually, and if I remove the guard the ram doesn’t do this.

that makes perfect sense. with the guard only one thread at a time can be
initializing the huge array, and that takes time; because of this the number
of threads grows quite large - thus the large amount of memory consumed as
they are added to the thread group. without the mutex the threads simply
race right through their work - creating the array and quickly exiting.

so, when the threads can be created and die quickly the maximum memory used by
the process simply never gets that big. with the mutex the maximum memory is
larger simply because the threads take longer to run.
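
you can watch the pile-up with something like this (rough sketch, counts are
arbitrary - run it once with the mutex and once with ‘nomutex’):

  # rough sketch: print how many worker threads are alive each second. with
  # the mutex the count climbs toward 1000; without it the workers finish
  # almost as fast as they are created.
  require 'thread'

  use_mutex = ARGV[0] != 'nomutex'
  guard     = Mutex.new
  group     = ThreadGroup.new

  sampler = Thread.new do
    loop { puts "live threads: #{group.list.size}"; sleep 1 }
  end

  1000.times do
    t = Thread.new do
      work = lambda { Array.new(30_000) { |i| i } }
      use_mutex ? guard.synchronize(&work) : work.call
    end
    group.add(t)
  end

  sleep 1 until group.list.empty?   # wait for all the workers to finish
  sampler.kill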

run this on your system:

 harp:~ > cat a.rb
 pid = Process.pid
 a = []
 new_array = lambda{|size| Array.new(size){ 42 }.map}
 eat_memory = lambda{|n| n.times{ a << new_array[4242] }}
 free_memory = lambda{a.clear and GC.start }

 eat_memory[4242]
 puts
 puts "after malloc"
 system "ps v #{ pid }"

 free_memory[]
 puts
 puts "after free"
 system "ps v #{ pid }"


 harp:~ > ruby a.rb

 after malloc
   PID TTY      STAT   TIME  MAJFL   TRS   DRS  RSS %MEM COMMAND
 31264 pts/10   R+     0:24      0   595 27164 26364  15.6 ruby a.rb

 after free
   PID TTY      STAT   TIME  MAJFL   TRS   DRS  RSS %MEM COMMAND
 31264 pts/10   S+     0:24      0   595 27164 26368  15.6 ruby a.rb

it shows the sort of behaviour i’m talking about

And where are you getting your information that free doesn’t free
memory? I’d like to read that since all my years of C coding says that
is dead wrong.

http://groups.google.com/group/comp.unix.programmer/browse_frm/thread/44a5705312cc5df9/829197acdf6651ac?lnk=gst&q=memory+not+really+freed&rnum=4#829197acdf6651ac
http://groups.google.com/group/comp.unix.programmer/browse_frm/thread/23e7be26dd21434a/2f84b3dc080c7519?lnk=gst&q=memory+not+really+freed&rnum=9#2f84b3dc080c7519

the thing is that this has nothing to do with c and everything to do with the os

Advanced Memory Allocation | Linux Journal

"When a process needs memory, some room is created by moving the upper
bound of
the heap forward, using the brk() or sbrk() system calls. Because a
system call
is expensive in terms of CPU usage, a better strategy is to call brk()
to grab
a large chunk of memory and then split it as needed to get smaller
chunks. This
is exactly what malloc() does. It aggregates a lot of smaller malloc()
requests
into fewer large brk() calls. Doing so yields a significant performance
improvement. The malloc() call itself is much less expensive than brk(),
because it is a library call, not a system call. Symmetric behavior is
adopted
when memory is freed by the process. Memory blocks are not immediately
returned
to the system, which would require a new brk() call with a negative
argument.
Instead, the C library aggregates them until a sufficiently large,
contiguous
chunk can be freed at once.

For very large requests, malloc() uses the mmap() system call to find
addressable memory space. This process helps reduce the negative effects
of
memory fragmentation when large blocks of memory are freed but locked by
smaller, more recently allocated blocks lying between them and the end
of the
allocated space. In this case, in fact, had the block been allocated
with
brk(), it would have remained unusable by the system even if the process
freed
it."

Care to tell me how malloc/free would report 80M with Mutex but properly show
the ram go down when there is no-Mutex?

it’s a sort of race

And why would Linux kill processes if the ram gets too high? Why do whole
VPS servers crash? I mean if this ram was just “fake” reporting (which
is very hard to believe) then why are all these things happening?

all memory is fake. that’s why you can do this

 harp:~ > free -b
              total       used       free     shared    buffers     cached
 Mem:    1051602944 1032327168   19275776          0  145215488  563167232
                                 ^^^^^^^^

 harp:~ > ruby -e 'way_too_big = "42" * 19275776; p way_too_big.size'
 38551552

:wink:

regarding the crashes - there are still limits which something is apparently
exceeding.

So please, point me at where in the specifications for malloc/free on
Linux it says that the memory reported will be high even though free and
malloc is called on 80M of ram slowly cycled out, and that linux will
still kill your process even though this ram is not really owned by the
process.

http://groups.google.com/group/comp.unix.programmer/search?group=comp.unix.programmer&q=memory+not+really+freed&qt_g=1&searchnow=Search+this+group

in any case, try running under electric fence, which will use a cleaner
malloc/free and show the ‘real’ behaviour. it really does seem fine.

regards.

-a

Zed S. wrote:

And why would Linux kill processes if the ram gets too high? Why do whole
VPS servers crash? I mean if this ram was just “fake” reporting (which
is very hard to believe) then why are all these things happening?

So please, point me at where in the specifications for malloc/free on
Linux it says that the memory reported will be high even though free and
malloc is called on 80M of ram slowly cycled out, and that linux will
still kill your process even though this ram is not really owned by the
process.

I’ve had a lot of experience with the “bizarre” behavior of the Linux
memory manager. First of all, which kernel do you have? Second, the
Linux out-of-memory killer can be turned off (also kernel-dependent).
Finally, Linux has this philosophy that “free memory is wasted memory”,
and “I am Linux, you are the user, I know what’s good for me, if that
works for you, great!” The combination of these means that things that
make sense to you or to a performance engineer might not actually be
happening. :slight_smile:

Oh, yeah, how much physical RAM do you have? If you have a recent
kernel, which of the dozens of memory configuration options are you
using? Have you considered switching to another OS? :slight_smile:

Cliff C. wrote:

This isn’t considered bizarre behavior and in fact is a feature. Even
after you shutdown an application it keeps shared libs and such in
memory (cache). That way the system as a whole is much quicker when you
access applications that call these shared libs. No point in having the
overhead of re-loading things into memory over and over again. If you
get to the point where you use additional applications and there isn’t
free ram it flushes some of the cached objects out so that you don’t
have to swap.

As a performance engineer who works with managers and capacity planners
of large systems, I consider a memory manager that can’t be tuned to a
workload bizarre. Performance comes in many dimensions, and we ask our
Linux systems to serve numerous roles.

Sometimes we want them to process large batch jobs for rapid turnaround,
sometimes we want them to provide rapid interactive response to
thousands of people logged on to an application, sometimes we want them
to manage a huge DBMS, sometimes we want them to serve up web pages,
sometimes we want them to participate in a high-performance cluster, and
sometimes we want them to be scientific workstations. The idea that a
single memory management algorithm/code without tuning parameters can
serve all of those needs is ludicrous.

I don’t know if the 2.6 kernel is genuinely better than 2.4, if I’ve gotten
smarter about the Linux memory manager, or both. :slight_smile: What I do know is
that just about all forms of the 2.4 Linux kernel, even the carefully
tuned ones in RHEL 3, behave in unpredictable and less than useful ways
when people do even moderately stupid things. I think a server ought to
be able to deal with a memory leak in a web server application in a more
productive way than the out of memory killer!

On Fri, 2006-08-25 at 16:10 +0900, [email protected] wrote:

On Fri, 25 Aug 2006, Zed S. wrote:

And where are you getting your information that free doesn’t free
memory? I’d like to read that since all my years of C coding says that
is dead wrong. Care to tell me how malloc/free would report 80M with
Mutex but properly show the ram go down when there is no-Mutex?

a nice explanation:

http://groups.google.com/group/comp.unix.programmer/browse_frm/thread/23e7be26dd21434a/2f84b3dc080c7519?lnk=gst&q=memory+not+really+freed&rnum=9#2f84b3dc080c7519

the paper referenced is also good.

ftp://ftp.cs.utexas.edu/pub/garbage/allocsrv.ps

So, a posting to a newsgroup from some guy in 2001 (5 years old) and a
paper written in 1995 (11 years old) that has references to papers as
old as 1964, none of which say that measurements of RAM will behave as
I’ve demonstrated today. Then more usenet postings (still no C code
that demonstrates this magic), and finally a recent article from
linuxjournal describing the Linux memory manager, but nothing that really
says memory will stay around at 80M levels even 20-30 seconds after all
the ram has supposedly been freed.

Riiight. Sounds like I’ll just go back to my now working program with
its fancy “Sync” instead of “Mutex” technology, since obviously there’s
no memory leak (even though we consistently demonstrate this fixes it in
several situations).

http://pastie.caboo.se/10317

http://pastie.caboo.se/10194
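
For the record, since pastie links don’t live forever, the change is basically
just swapping the Mutex for a Sync from the stdlib. A rough sketch of the
guarded part (not the exact pastie):

  require 'thread'   # Mutex
  require 'sync'     # Sync, from the stdlib

  # guard = Mutex.new      # the version whose memory never seems to come back
  guard   = Sync.new       # the version that behaves

  group = ThreadGroup.new
  until group.list.size >= 1000
    group.add(Thread.new do
      guard.synchronize { Array.new(30_000) { |i| i } }
    end)
  end

  until group.list.empty?
    sleep 10
    GC.start
  end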

On Sat, 26 Aug 2006, Cliff C. wrote:

This isn’t considered bizarre behavior and in fact is a feature. Even after
you shutdown an application it keeps shared libs and such in memory (cache).
That way the system as a whole is much quicker when you access applications
that call these shared libs. No point in having the overhead of re-loading
things into memory over and over again. If you get to the point where you
use additional applications and there isn’t free ram it flushes some of the
cached objects out so that you don’t have to swap.

convincing zed of this might be hard :wink:

-a

Marshall T. Vandegrift wrote:

As Ara described in another message in this thread, Unix-y heap implementations
(including glibc’s) typically get new process memory from the kernel by calling
brk() to extend the process’s data segment. This can increase the apparent
memory usage of code because the heap implementation can only move the end of
the data segment backwards (releasing memory to the system) as far as the last
page containing in-use memory.

Patient: Doctor, it hurts when I do that!
Doctor: OK … don’t do that!

Patient: I have this pain right here.
Doctor: Have you ever had this before?
Patient: No.
Doctor: Well, you have it now!

[1] I used glibc’s provided hooks for malloc/realloc/free and wrote a
quick-and-dirty custom trampoline for brk(). Yes, I know about ltrace, but
it only caught brk() as a system call, which slowed things down by a factor
of 1000 (!). And yes Solaris DTrace would have made this easy, but I made
my graphs while waiting for the 2.5Gb Solaris Express DVD to finish
downloading. :slight_smile: I can provide my code if anyone wants it.

I’m more interested in what these tests do on Solaris :). We have
Windows, Linux, Mac OS, and I think someone ran it on BSD. Just out of
curiosity, will these test cases run on YARV?

Zed S. [email protected] writes:

I boiled the problem down to this:

http://pastie.caboo.se/10194

And this one, from the other thread:

http://pastie.caboo.se/10317

It’s a graph of the “leak” and the base code that causes it (nothing
Mongrel in it at all). This code kind of simulates how Mongrel is
managing threads and locking Rails.

Try out these graphs:

http://pastie.caboo.se/10550
http://pastie.caboo.se/10551

On an otherwise quiet GNU/Linux system, I ran each of your scripts for half an
hour in a ruby process into which I loaded hooks for malloc(), realloc(),
free(), and brk() [1]. The red line is the amount of memory ruby is consuming
from the heap. The green line is the amount of memory the heap implementation
is consuming from the system.

As Ara described in another message in this thread, Unix-y heap implementations
(including glibc’s) typically get new process memory from the kernel by calling
brk() to extend the process’s data segment. This can increase the apparent
memory usage of code because the heap implementation can only move the end of
the data segment backwards (releasing memory to the system) as far as the last
page containing in-use memory.

In the graphs I’ve created we can clearly see that in both cases ruby’s use of
memory from the heap drops back around the baseline with each iteration. In
the ‘sync.rb’ case, the heap implementation’s use of memory from the system
tracks user code heap usage pretty closely. In the ‘mutex.rb’ case, however,
we see two anomalies: (a) the heap implementation’s use of system memory stays
right near the maximum amount taken by ruby from the heap, and (b) the maximum
memory used is around twice that of ‘sync.rb’.

Ara also mentioned in another message that Mutex is faster than Synchronize,
and hypothesized that this was leading to a pathological interaction with the
ruby garbage collector when creating as many threads as your example does. My
data suggest that the observed “Mutex memory leak” /is/ caused by Mutex’s
relative speed, but that it is due to an interaction with the system heap
implementation rather than with ruby’s garbage collector. I haven’t tracked
down the exact factors at work, but I’d guess that ‘mutex.rb’ is (a) allocating
new memory within the time window the heap implementation holds onto unused
system memory, and (b) under some circumstances sufficiently fragments memory
to force the heap implementation to allocate more system memory than required,
eventually bringing the process under the gaze of the OOM killer.

I hope this helps.

[1] I used glibc’s provided hooks for malloc/realloc/free and wrote a
    quick-and-dirty custom trampoline for brk(). Yes, I know about ltrace, but
    it only caught brk() as a system call, which slowed things down by a factor
    of 1000 (!). And yes Solaris DTrace would have made this easy, but I made
    my graphs while waiting for the 2.5Gb Solaris Express DVD to finish
    downloading. :slight_smile: I can provide my code if anyone wants it.

-Marshall

On Aug 26, 2006, at 5:14 AM, Zed S. wrote:

a nice explanation:

So, a posting to a newsgroup from some guy in 2001 (5 years old) and a
paper written in 1995 (11 years old) that has references to papers as
old as 1964, none of which say that measurements of RAM will behave as
I’ve demonstrated today.

brk and sbrk are still used, for example, phkmalloc:

http://www.freebsd.org/cgi/cvsweb.cgi/src/lib/libc/stdlib/malloc.c?rev=1.90.2.1&content-type=text/x-cvsweb-markup&only_with_tag=RELENG_6

phkmalloc returns memory to the OS when an entire page is clean. In
Ruby a page may not always be clean because ruby heap slots may be
scribbled upon it. See add_heap in gc.c:

http://www.ruby-lang.org/cgi-bin/cvsweb.cgi/ruby/gc.c?rev=1.168.2.45;content-type=text%2Fx-cvsweb-markup;only_with_tag=ruby_1_8

A good way to test this theory would be to either increase the number
of items per ruby heap slot by editing gc.c and recompiling, or to use
many more threads. (Maybe I will do that.)
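
Something along these lines would do for the second option (a rough sketch;
the thread counts are arbitrary):

  # Rough sketch: run the mutex/array pattern at several thread counts and
  # report the resident set size after each round, to see whether the retained
  # memory scales with the number of heap slots touched.
  require 'thread'

  def rss_kb
    `ps -o rss= -p #{Process.pid}`.to_i   # resident set size in kB
  end

  guard = Mutex.new

  [1_000, 5_000, 20_000].each do |count|
    threads = (1..count).map do
      Thread.new { guard.synchronize { Array.new(30_000) { |i| i } } }
    end
    threads.each { |t| t.join }
    GC.start
    puts "#{count} threads: rss=#{rss_kb} kB after GC"
  end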

The real difference may be that Sync and Mutex have different memory
usage profiles.


Eric H. - [email protected] - http://blog.segment7.net
This implementation is HODEL-HASH-9600 compliant

http://trackmap.robotcoop.com

On Aug 28, 2006, at 7:26 PM, Eric H. wrote:

brk and sbrk are still used, for example, phkmalloc:

re: my previous post, looking for brk and sbrk will be informative also

Gary W.