On Mar 10, 2006, at 7:49 AM, Adam F. wrote:
Well, not really. Since connection pools are available intraprocess,
there are several distinct models:
- Non-persistent connections, each client opens its own connection,
and closes it when it’s done. Basically 1:1 clients to
connections.
Common examples: CGI and mod_perl without Apache::DBI
- Persistent connections, each client opens its own connection, but
if there’s one already open, that’s used. Definitely 1:1 clients to
connections, and in fact this can be worse, because clients are
probably holding open persistent connections even if they’re not
using them. This may make sense if the connection overhead is
large.
Common examples: mod_perl with Apache::DBI
Additionally, I believe you’re throwing in the assumption that app
connections are equal to HTTP connections.
- “Real” DB pooling, where the application server / connection
manager hands out db connections as needed for each query and
reclaims them back to the pool when they’re done. There’s basically
no relation between the number of connections and the number of
client users.
Common examples: Java
- “Thread”-based connections, where each application thread (or
process, it’s not important how it’s implemented) gets its own
connection. Many client users may share an application thread
connection.
Yes, and only dynamic requests get handled by the process that holds
the DB connection open, and the HTTP connection to the client is NOT
handled by the application thread.
The first two are basic PHP models, #3 is how Java does it if you’re
doing it right, and if I’m understanding it, #4 is the rails method.
Correct?
Yes, with clarifying comments.
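To make #3 concrete, here’s a minimal, hypothetical sketch of that
checkout/checkin pattern in Java. The class name, pool size, and JDBC
details are made up for illustration; real Java app servers expose this
through a DataSource, but the shape is the same:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.SQLException;
    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;

    // A fixed pool of DB connections handed out per query and reclaimed
    // afterward, so connection count is unrelated to client-user count.
    public class TinyPool {
        private final BlockingQueue<Connection> idle;

        public TinyPool(String jdbcUrl, int size) throws SQLException {
            idle = new ArrayBlockingQueue<>(size);
            for (int i = 0; i < size; i++) {
                idle.add(DriverManager.getConnection(jdbcUrl));
            }
        }

        // Block until a connection is free; many app threads share the pool.
        public Connection checkout() throws InterruptedException {
            return idle.take();
        }

        // Return the connection so the next query can reuse it.
        public void checkin(Connection c) {
            idle.add(c);
        }
    }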
…ratio at a few thousand to one. Certainly you wouldn’t run the same
few thousand users with a pool of one connection, would you?
No - but that’s a limitation of the thread connection model. Do I need
to spawn a whole new FCGI process if I saturate my db (or any other
external resource, for that matter)? That seems like an inefficient
shotgun scaling mechanism.
I don’t understand you here. The whole new FCGI process is spawned in
advance, and tuned over time to match traffic patterns. Generally
speaking, at the high end of scalability we’re discussing, you’d add
FCGI processes by plugging in a new application server.
…exhausting the number of possible connections.
…under normal conditions is about 4,000, although I’ve been able to
push it to around 10,000), you’re probably pushing static and/or
cached files out from somewhere else: either a CDN, or some sort of
front-end caching mechanism.
Yes, those are the front-end proxy servers in the drawing below…
On 16 webservers, each of which runs the default max of 256 Apache
processes (or 4 webservers if you push it to 1024), that’s enough to
crash the database under heavy load. Granted, this is larger than the
average application, but that’s why they call it “scaling”.
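(Spelling out that arithmetic: 16 × 256 = 4,096 Apache processes, and
4 × 1024 = 4,096 as well. If each of those held its own DB connection,
you’d be past the ~4,000 figure quoted above.)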
But these connections are the client-side HTTP connections, and DO NOT
hold DB connections.
I’ve seen cases where even static files are invoking the PHP engine
because of misconfiguration, and that obviously is a problem that
compounds this. No idea if that’s a possible mistake to make in
Rails. I hope not.
Well, you know what they say about pigs…lipstick doesn’t help. 
…significant performance increase caused by connection pooling comes
from maintaining the connection as opposed to reducing resource
utilization.
If your connection overhead is high, as it is with Oracle. This is
MUCH less of an issue with MySQL, and in fact, you’ll often get better
performance by not using persistent connections, letting them cycle
away rather than be held beyond when they’re needed.
I find it very hard to believe that connecting per request is more
efficient than persistent connections, though I do agree that MySQL
connects incredibly fast compared to Oracle, and would therefore show
a much smaller improvement vs. Oracle.
But, I can only imagine that’s faster if you’re opening connections
that aren’t going to be used. If that’s the case, then all bets are
off.
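For what it’s worth, the cheap way to settle this for a particular
stack is to measure it. Here’s a rough micro-benchmark sketch, in
Java/JDBC purely for illustration; the URL, credentials, and iteration
count are placeholders, and results will vary wildly by driver,
network, and server:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.SQLException;
    import java.sql.Statement;

    // Sketch: compare a fresh MySQL connection per query against one
    // persistent connection reused for every query.
    public class ConnectCost {
        static final String URL =
            "jdbc:mysql://db.example.com/test?user=app&password=secret";

        public static void main(String[] args) throws SQLException {
            int n = 100;

            // Fresh connection per query (the non-persistent model).
            long t0 = System.nanoTime();
            for (int i = 0; i < n; i++) {
                try (Connection c = DriverManager.getConnection(URL);
                     Statement s = c.createStatement()) {
                    s.execute("SELECT 1");
                }
            }
            long fresh = (System.nanoTime() - t0) / n;

            // One persistent connection reused for every query.
            long t1 = System.nanoTime();
            try (Connection c = DriverManager.getConnection(URL)) {
                for (int i = 0; i < n; i++) {
                    try (Statement s = c.createStatement()) {
                        s.execute("SELECT 1");
                    }
                }
            }
            long persistent = (System.nanoTime() - t1) / n;

            System.out.printf("fresh: %d us/query, persistent: %d us/query%n",
                    fresh / 1000, persistent / 1000);
        }
    }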
The reason for this is that the latency and bandwidth limits between
the web client and the HTTP server, plus the fact that many of those
connections require NO App Server utilization (static files and
images), combine to produce resource contention in the first stage
that is between 8 and 16 times the resource contention of the 2nd and
3rd stages combined.
It depends on what kind of queries you’re running and how long they
take on average, but okay for the general case.
Oh, absolutely! The ratio needs to be tuned per-application, and is
dependent on many variables, perhaps the most volatile of which is the
application code itself. Efficient apps will have higher HTTP/app
ratios than inefficient apps.
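As an illustrative, completely made-up example of what tuning that
ratio means: if an app process is busy for 40 ms per dynamic request,
but the client’s HTTP connection stays open for 2 seconds of keep-alive
and slow transfer, one app process can in principle stand behind
roughly 2000 / 40 = 50 concurrent HTTP connections. A less efficient
app at 200 ms per request drops that to 10.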
…likely to be in the 30-60 ms range, while the intra-farm packet…
These are the reasons why you can handle more than one client-side
connection per application process.
Latency is not the only issue though - this equation is heavily
dependent on the efficiency of your backend queries, how long they
take to run, and how they stack up with respect to each other and the
db’s own resource contention algorithms. For example, if you’re doing
a lot of big file uploads, you may be holding open db connections for
the whole length of that connection, which may be minutes, even if you
only use the db at the very beginning to record the transaction.
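One mitigation pattern here, sketched against the hypothetical TinyPool
from earlier (again illustrative, not anything prescribed in this
thread): acquire the connection late and release it early, so the
minutes-long upload never pins a DB connection.

    import java.io.InputStream;
    import java.sql.Connection;
    import java.sql.PreparedStatement;

    // The upload itself (possibly minutes) holds no DB connection;
    // the pool is touched only for the milliseconds the INSERT needs.
    public class UploadHandler {
        private final TinyPool pool;

        public UploadHandler(TinyPool pool) { this.pool = pool; }

        public void handle(InputStream upload, String filename) throws Exception {
            byte[] data = upload.readAllBytes(); // long-running, no DB held

            Connection c = pool.checkout();      // held only for the insert
            try (PreparedStatement ps = c.prepareStatement(
                    "INSERT INTO uploads (name, size_bytes) VALUES (?, ?)")) {
                ps.setString(1, filename);
                ps.setInt(2, data.length);
                ps.executeUpdate();
            } finally {
                pool.checkin(c);
            }
        }
    }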
Yes, again, there are many complex variables that play into the
equation.
The reason I mentioned latency is that I’ve found again and again in
my career that serial latencies are more often the cause of
performance problems than resource limitations, though the latencies
often cause a particular architecture to be starved for resources.
I find again and again that many people have a hard time conceptualizing
how fast computers are. They think several milliseconds is incredibly
fast, where to a modern computer it’s incredibly slow. I cannot tell you
how many times I’ve seen applications running slowly, with little to no
disk and/or processor utilization, because of serialized latencies, and
people suggesting that throwing more and/or faster hardware at it was
going to fix the problem.
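A toy demonstration of that point, with a sleep standing in for one
network round trip (the 30 ms figure echoes the latency numbers above):

    // 100 serialized 30 ms round trips burn ~3 seconds of wall-clock
    // time while the CPU sits essentially idle. Faster hardware changes
    // nothing; only reducing or overlapping the round trips helps.
    public class SerialLatency {
        public static void main(String[] args) throws InterruptedException {
            int calls = 100;
            long start = System.nanoTime();
            for (int i = 0; i < calls; i++) {
                Thread.sleep(30); // stand-in for one serialized round trip
            }
            long ms = (System.nanoTime() - start) / 1_000_000;
            System.out.println(calls + " serialized calls took " + ms + " ms");
        }
    }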
Perhaps what you say is true for many apps, but I’m also interested in
the difficult boundary cases. 
Me too. While I gave some specific examples of the most common issues,
we’re in agreement that there’s more than one way to cause an
application to perform poorly.
Again, just remember the pig!
--
-- Tom M.