I have a custom capistrano recipe that runs commands on a couple of
remote hosts, using publickey ssh, and has been working for several
months. Recently (in the last week or so), something changed that I have
not been able to identify, and the Net::SSH.start() method now raises
AuthenticationFailed for only one of the remote hosts. The capistrano
recipe has not changed, the net-ssh gem is 2.0.8 and was updated long
ago. I can get the same error using either capistrano, or direct
Net::SSH.start within IRB.
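A minimal sketch of that direct check outside capistrano, assuming the net-ssh gem is installed and the keys live in their default locations. The `probe` helper, the `NET_SSH_CONNECTOR` constant and the `connector` argument are hypothetical names introduced only for illustration; the hostnames and user are the placeholders used in this post.

```ruby
require 'logger'

# Default connector: opens a session with debug logging and runs a trivial
# command. net/ssh is required lazily so the helper below can also be
# exercised with a stand-in connector (hypothetical design, not part of net-ssh).
NET_SSH_CONNECTOR = lambda do |host, user|
  require 'net/ssh'
  Net::SSH.start(host, user, :verbose => Logger::DEBUG) do |ssh|
    ssh.exec!('hostname')
  end
end

# Try each host in turn and record :ok or the name of the exception raised
# (for example "Net::SSH::AuthenticationFailed").
def probe(hosts, user, connector = NET_SSH_CONNECTOR)
  results = {}
  hosts.each do |host|
    begin
      connector.call(host, user)
      results[host] = :ok
    rescue StandardError => e
      results[host] = e.class.name
    end
  end
  results
end

# Usage (needs real hosts, so it is left commented out):
# p probe(%w[goodhost badhost], 'userx')
```

Running both hosts through the same code path keeps the comparison fair: any difference in the result should come from the remote side rather than from how the call was typed in IRB.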
I use Ubuntu 8.04 on all computers involved, and follow regular security
updates, but have not been able to track which one might have caused
this, and why only one remote machine is affected.
Also, normal command-line ‘ssh’ works correctly every time for all
hosts.
To try and track this, I’ve done a series of tests connecting to two
remote machines, one working and one not, and comparing the differences,
both with command-line ssh and with Net::SSH with high debug levels set.
Most differences found were expected, like different host keys, but the
sequence of events was always the same, with one exception: in the debug
output from Net::SSH, which shows a sequence of messages and responses,
the response to message 5 differed. Here follows a more explicit
description of the test:
Given that ‘ssh -l userx goodhost’ and ‘ssh -l userx badhost’ both work
correctly and identically with publickey authentication, we get the
following results in IRB (edited for clarity):
Net::SSH.start('goodhost', 'userx', {:verbose => Logger::DEBUG})
…
trying publickey
queueing packet nr 5 type 50 len 508
received packet nr 5 type 60 len 460
queueing packet nr 6 type 50 len 556
received packet nr 6 type 52 len 12
publickey succeeded
Net::SSH.start('badhost', 'userx', {:verbose => Logger::DEBUG})
…
trying publickey
queueing packet nr 5 type 50 len 508
received packet nr 5 type 51 len 44
all authorization methods failed
So, it seems a packet of type 50 should receive a packet of type 60 in
response, but gets type 51 instead. I have no idea what these numbers
mean, or why Net::SSH receives a different response when the
command-line ssh works fine.
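For reference, the type values in the debug output match the message numbers defined for SSH user authentication in RFC 4252: 50 is a USERAUTH_REQUEST, 51 a USERAUTH_FAILURE, 52 a USERAUTH_SUCCESS, and 60 (in a publickey exchange) a USERAUTH_PK_OK. The sketch below labels the packets from the two sessions above and reports the first point where they diverge. The helper names (`packet_types`, `first_divergence`, `describe`) are made up for illustration, and the regular expression assumes the log lines look exactly like the ones quoted in this post.

```ruby
# Message numbers from the SSH authentication protocol (RFC 4252).
# Number 60 is method-specific; in a publickey exchange it means PK_OK.
USERAUTH_TYPES = {
  50 => 'USERAUTH_REQUEST',
  51 => 'USERAUTH_FAILURE',
  52 => 'USERAUTH_SUCCESS',
  60 => 'USERAUTH_PK_OK'
}.freeze

# Pull [direction, type] pairs out of debug lines such as
# "received packet nr 5 type 60 len 460".
def packet_types(log)
  log.scan(/(queueing|received) packet nr \d+ type (\d+)/).map do |direction, type|
    [direction, type.to_i]
  end
end

# Return [index, good_entry, bad_entry] for the first packet where the two
# sessions differ, or nil when the sequences are identical.
def first_divergence(good_log, bad_log)
  good = packet_types(good_log)
  bad = packet_types(bad_log)
  index = (0...[good.size, bad.size].max).find { |i| good[i] != bad[i] }
  index && [index, good[index], bad[index]]
end

# Turn one [direction, type] pair into a readable label.
def describe(entry)
  return 'nothing' if entry.nil?
  direction, type = entry
  "#{direction} #{USERAUTH_TYPES.fetch(type, "type #{type}")}"
end

good_log = <<LOG
trying publickey
queueing packet nr 5 type 50 len 508
received packet nr 5 type 60 len 460
queueing packet nr 6 type 50 len 556
received packet nr 6 type 52 len 12
LOG

bad_log = <<LOG
trying publickey
queueing packet nr 5 type 50 len 508
received packet nr 5 type 51 len 44
LOG

_index, good, bad = first_divergence(good_log, bad_log)
puts "goodhost: #{describe(good)}; badhost: #{describe(bad)}"
```

Read this way, goodhost answers the first publickey request by accepting the offered key, while badhost rejects that same request outright, so the server-side auth log on badhost is probably the next place to look.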
As said before, this is a problem that has suddenly happened, with no
obvious change to the computer configurations, and all computers are
identically configured (with regard to ssh, ruby and Net::SSH).
Does anyone have any ideas, or further suggestions on where to look?
P.S. Things I’ve already tried that have not helped:
- removing ssh gateway from my configs
- downgrading net-ssh gem all the way back to 2.0.1
- running command-line ssh with maximum verbosity to find differences