Is r1303 unstable for others, too?

Hej!

I’ve just updated Typo to r1303 today. After migrating the database to
version 58, everything seemed to work fine. Unfortunately my FCGI
instance seems to die after a few minutes, but I don’t get any entry in
any log/ file. Should I look somewhere else?

I’m running on Textdrive, if that makes any difference. But I don’t
think I’m hitting the memory limit, since I’m not getting any messages
from the process that kills big processes (what was it called again?). At
least I don’t think I am. It should write something into a file in my
home dir, shouldn’t it?

Also, I’m using the new tmcode macro. Maybe that’s part of my problem?

Urban

“Urban H.” [email protected] writes:

It should write something into a file in my home dir, shouldn’t it?
I don’t know about the workings of TextDrive; I do know they have
pretty tight limits on memory use, but I don’t know where you’d find
out about what processes got reaped.

Certainly dying without informing the logfile of anything smacks of
being ‘kill -9’d by a resource limiter.

Do you get anything at all added to your logfile?

Also, I’m using the new tmcode macro. Maybe that’s part of my problem?

Possibly. I’ve not looked at its workings to be honest.

If it was killed by TxD’s samurai process, then you will see a
process_watchdog.log in your home directory which lists why/when/what
was killed.

-Linda

“Urban H.” [email protected] writes:

At least I don’t think I am. It should write something into a file in my
home dir, shouldn’t it?

Also, I’m using the new tmcode macro. Maybe that’s part of my problem?

What happens if you roll back to r1299?

That takes it back to Rails 1.1.6. If it’s stable, I’d appreciate it
if you could then step forward to r1300 and check that for stability.
If that’s stable, move forward one step at a time, checking each version
for stability, which should help us nail down which specific changes
are responsible for what I’m assuming is a memory leak issue.

“Linda D.” [email protected] writes:

If it was killed by TxD’s samurai process, then you will see a
process_watchdog.log in your home directory which lists why/when/what
was killed.

Fair enough. I wonder what is causing it, then. I can’t replicate the
issue here at the moment, but I’m giving whiteboards a hard look
because they don’t quite work right.

On Nov 24, 2006, at 20:27, Linda D. wrote:

If it was killed by TxD’s samurai process, then you will see a
process_watchdog.log in your home directory which lists why/when/what
was killed.

Well, there’s no such file. So that’s not what is happening, it seems.

Thanks for the tip.

Urban

On Nov 24, 2006, at 20:24, Piers C. wrote:

What happens if you roll back to r1299?

That takes it back to Rails 1.1.6. If it’s stable, I’d appreciate it
if you could then step forward to r1300 and check that for stability.
If that’s stable, move forward one step at a time, checking each version
for stability, which should help us nail down which specific changes
are responsible for what I’m assuming is a memory leak issue.

I’ll try tomorrow. Is there anything special to be aware of? Like
having to migrate the database back?

Urban

On 11/26/2006, “Piers C.” [email protected] wrote:

are responsible for what I’m assuming is a memory leak issue.

I’ll try tomorrow. Is there anything special to be aware of? Like
having to migrate the database back?

I did this, and r1299 seems to be stable, at least with the test I used
to trigger this memory leak:

  1. Go to homepage
  2. Go to main admin page
  3. Click empty fragment cache
  4. Reload homepage
  5. Boom!

When I do this at r1300 I get the following on my console:

[FATAL] failed to allocate memory

Neither the fastcgi.crash.log nor the production.log contains any error
messages. Should I try switching to development mode? Does that give more
output?

Urban

“Urban H.” [email protected] writes:

Should I try switching to development mode? Does that give more output?

Development probably doesn’t give more output, and probably breaks, if
anything, earlier. But it might be worth trying.

One other thing to try is to change blog.rb, text_filter.rb and
user.rb back to using ActiveRecord::Base rather than CachedModel as
their super classes.
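
For concreteness, a minimal sketch of that change (class names inferred
from the file names, not copied from the Typo source; the class bodies
stay exactly as they are):

# app/models/blog.rb (was: class Blog < CachedModel)
class Blog < ActiveRecord::Base
  # ... existing model code unchanged ...
end

# The same one-line superclass change then goes into
# app/models/text_filter.rb (TextFilter) and app/models/user.rb (User).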

If you do remove the CachedModel inheritance stuff, you can also
modify the beginning of app/controllers/application.rb so the opening
stanza looks like:

class ApplicationController < ActionController::Base
  include LoginSystem

  # was: before_filter :reset_local_cache, :fire_triggers
  before_filter :fire_triggers

  # was: after_filter :reset_local_cache

I’m afraid this sort of debugging is unlikely to be quick; I’m with a
different hosting provider, otherwise I could probably get at what’s up
a wee bit quicker. Thanks for your help with this.

Urban H. [email protected] writes:

I’ll try tomorrow. Is there anything special to be aware of? Like
having to migrate the database back?

Good point. Run rake db:migrate VERSION=55 before you do the rollback.

Kevin B. [email protected] writes:

Incidentally, Piers, if tmcode is causing a problem (such as increased
memory usage) I would say that points to an issue with the whiteboard
implementation. Since you said you’re giving whiteboards a hard look,
hopefully if that’s the case you’ll find it.

On the other hand, tmcode could just be a red herring.

Seems to be; r1299 works and r1300 doesn’t.

On 11/27/2006, “Piers C.” [email protected] wrote:

class ApplicationController < ActionController::Base
  include LoginSystem

  # was: before_filter :reset_local_cache, :fire_triggers
  before_filter :fire_triggers

  # was: after_filter :reset_local_cache

I’ve updated to r1324, changed the files you suggested but I still get
the same result. Attached are the log files.

Urban

“Urban H.” [email protected] writes:

I’ve updated to r1324, changed the files you suggested but I still get
the same result. Attached are the log files.

Ah… try running in production mode; development mode leaks memory.

Try going back to a vanilla r1324 and uncommenting the
RAILS_ENV=production line in config/environment.rb
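
For reference, that’s the line which ships commented out near the top of
the stock Rails config/environment.rb (the surrounding comment wording
varies a bit between Rails versions; this is a sketch, not Typo’s exact
file):

# config/environment.rb
# Uncomment below to force Rails into production mode
ENV['RAILS_ENV'] ||= 'production'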


“Urban H.” [email protected] writes:

On 12/9/2006, “Piers C.” [email protected] wrote:

Ah… try running in production mode; development mode leaks memory.

Try going back to a vanilla r1324 and uncommenting the
RAILS_ENV=production line in config/environment.rb

I actually run in production mode normally. I tried it again with r1324
(and with the scribbish theme instead of a theme not supplied in the Typo
tree) and I get the same error.

So, what do we know?

  1. Migrations went smoothly and you’re on schema version 61
  2. Textdrive isn’t killing stuff because it’s not putting any reports
    in your home directory
  3. Typo’s dying before it logs anything.

Does it always die in the same place?

On Dec 12, 2006, at 0:32, Piers C. wrote:

So, what do we know?

  1. Migrations went smoothly and you’re on schema version 61
  2. Textdrive isn’t killing stuff because it’s not putting any reports
    in your home directory
  3. Typo’s dying before it logs anything.

Sometimes I get the message “[FATAL] failed to allocate memory” on the
console that I started the FCGI on. BTW, maybe it’s of interest: I’m
running Typo as an FCGI process using lighttpd. I mean, I start the FCGI
using “spawn-fcgi” and tell lighttpd where to find the socket.

Does it always die in the same place?

Not really. It seems that the problem occurs whenever Typo tries to load
a new page that isn’t in the cache yet. Sometimes it works for one more
page, but once I try to load another page that isn’t in the cache (or so
I’m guessing) it blows up.

Urban

Urban H. [email protected] writes:

I’m running Typo as an FCGI process using lighttpd. I mean, I start the
FCGI using “spawn-fcgi” and tell lighttpd where to find the socket.

Oh crap. Memory leak. I hate memory leaks.

Does it always die in the same place?

Not really. It seems that the problem occurs whenever Typo tries to load
a new page that isn’t in the cache yet. Sometimes it works for one more
page, but once I try to load another page that isn’t in the cache (or so
I’m guessing) it blows up.

Bugger. Definitely a memory leak.

On Dec 23, 2006, at 12:15, Piers C. wrote:

Does it always die in the same place?

Not really. It seems that the problem occurs whenever Typo tries to load
a new page that isn’t in the cache yet. Sometimes it works for one more
page, but once I try to load another page that isn’t in the cache (or so
I’m guessing) it blows up.

Bugger. Definitely a memory leak.

Seems like. BTW, thanks for all the help tracking down this bug!

Urban

Urban H. [email protected] writes:

On Dec 23, 2006, at 12:15, Piers C. wrote:

Bugger. Definitely a memory leak.

Seems like. BTW, thanks for all the help tracking down this bug!

If only it were tracked down. Now we’ve got to find where we’re
leaking from, and that’s never fun.
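
One low-tech way to start narrowing it down on Ruby 1.8 / Rails 1.x is to
log per-class object counts after each request and diff them between
requests; whichever classes keep growing point at the leak. The sketch
below is just that, a sketch; the filter name and the TYPO_LEAK_HUNT flag
are invented for illustration:

# Hypothetical debugging aid, dropped into app/controllers/application.rb
class ApplicationController < ActionController::Base
  after_filter :log_object_counts if ENV['TYPO_LEAK_HUNT']

  private

  # Run a GC, count live objects per class, and log the 20 biggest
  # populations; classes whose counts climb request after request are
  # the likely leak.
  def log_object_counts
    GC.start
    counts = Hash.new(0)
    ObjectSpace.each_object { |obj| counts[obj.class] += 1 }
    top = counts.sort_by { |klass, n| -n }.first(20)
    logger.info('objects: ' + top.map { |klass, n| "#{klass}=#{n}" }.join(' '))
  end
end

The ObjectSpace walk is slow, so it’s only something to switch on while
reproducing the crash, not to leave running all the time.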