The first thing to understand about SSH and file transfers, is that SSH doesn't do file transfers.
Ahem.
Now that we have your attention, what can we possibly mean by that? After all, there are entire sections of this book dedicated to explaining how to use scp1, scp2, and sftp for file transfers.
What we mean is that there is nothing in the SSH protocol about transferring files: an SSH speaker can't ask its partner to send or receive a file through the protocol. And the programs we just mentioned don't actually implement the SSH protocol themselves nor incorporate any security features at all. Instead, they actually run the SSH client in a subprocess, in order to connect to the remote host and run the other half of the file-transfer process there. There is nothing very SSH- specific about these programs; they use SSH in much the same way as do other applications we cover, such as CVS and Pine.
The only reason it was necessary to come up with scp1 in the first place was that there was no widely used, general-purpose file-transfer protocol available that operated over a the single, full- duplex byte stream connection provided by the SSH remote program execution. If existing FTP implementations could easily be made to operate over SSH, there would be no need for ssh, but as we'll see, FTP is entirely unsuited to this. [Section 11.2] So Tatu Ylửnen wrote scp1 and made it part of SSH1. The protocol it uses (let's call it "SCP1") remained entirely undocumented, even when Ylửnen wrote the first RFC documenting the SSH-1 protocol.
Later, when SSH Communications Security was writing SSH2, they wanted to continue to include a file-transfer tool. They stayed with the model of layering it on top of SSH proper, but decided to entirely reimplement it. Thus, they replaced the "scp1 protocol" with the "SFTP protocol," as it is commonly known. The SFTP protocol is again simply a way to do bidirectional file transfers over a single, reliable, full-duplex byte stream connection. It happens to be based on the same packet protocol used as the substrate for the SSH Connection Protocol, presumably as a matter of convenience. The implementers already had a tool available for sending record-oriented messages over a byte pipe, so they reused it. SFTP remains an undocumented, proprietary protocol at press time, though there is work beginning in the IETF SECSH working group to document and standardize it.
The name SFTP is really unfortunate, because it confuses people on a number of levels. Most take it to stand for "Secure FTP." First, just as with scp1, as a protocol it isn't secure at all; the
implementation derives its security by speaking the protocol over an SSH connection. And second, it has nothing whatsoever to do with the FTP protocol. It is a common mistake to think you can somehow use SFTP to talk securely to an FTP server—a reasonable enough supposition, given the name.
Another confusing aspect of file transfer in SSH2, is the relationship among the two programs scp2 and sftp, and the SFTP protocol. In SSH1, there is a single file-transfer protocol, SCP1, and a
TE AM FL Y
Team-Fly®
single program embodying it: scp1. In SSH2, there is also a single, new file-transfer protocol:
SFTP. But there are three separate programs implementing it and two different clients. The server side is the program sftp-server. The two clients are scp2 and sftp. scp2 and sftp are simply two different front-ends for the same process: each runs the SSH2 client in a subprocess to start and speak to sftp-server on the remote host. They merely provide different user interfaces: scp2 is more like the traditional rcp, and sftp is deliberately similar to an FTP client.
None of this confusing terminology is made any easier by the fact that both SSH1 and SSH2 when installed make symbolic links allowing you to use the plain names "scp," "ssh," etc., instead of
"scp1" or "ssh2." When we speak of the two SSH-related file-transfer protocols, we call them the SCP1 and SFTP protocols. SCP1 is sometimes also just called the "scp" protocol, which is technically ambiguous but usually understood. We suppose you could refer to SFTP as the "scp2 protocol," but we've never heard it and don't recommend it if you want to keep your sanity.[21]
[21] Especially since scp2 may run scp1 for SSH1 compatibility! Oy gevalt!
3.8.1 scp1 Details
When you run scp1 to copy a file from client to server, it invokes ssh1 like this:
ssh -x -a -o "FallBackToRsh no" -o "ClearAllForwardings yes" server- host scp ...
This runs another copy of scp on the remote host. That copy is invoked with the undocumented switches -t and -f (for "to" and "from"), putting it into SCP1 server mode. This next table shows some examples; Figure 3-6 shows the details.
This client scp command: Runs this remote command:
scp foo server:bar scp -t bar scp server:bar foo scp -f bar scp *.txt server:dir scp -d -t dir
Figure 3.6. scp1 operation
If you run scp1 to copy a file between two remote hosts, it simply executes another scp1 client on the source host to copy the file to the target. For example, this command:
scp1 source:music.au target:playme
71 runs this in the background:
ssh1 -x -a ... as above ... source scp1 music.au target:playme
3.8.2 scp2/sftp Details
When you run scp2 or sftp, they run ssh2 behind the scenes, using this command:
ssh2 -x -a -o passwordprompt "%U@%H\'s password:"
-o "nodelay yes"
-o "authenticationnotify yes"
server host -s sftp
Unlike scp1, here the command doesn't vary depending on the direction or type of file transfer; all the necessary information is carried inside the SFTP protocol.
Note that they don't start sftp-server with a remote command, but rather with the SSH2
"subsystem" mechanism via the -s sftp option. [Section 5.7] This means that the SSH2 server must be configured to handle this subsystem, with a line like this in /etc/sshd2_config:
subsystem-sftp /usr/local/sbin/sftp-server
Assuming the ssh2 command succeeds, sftp and sftp-server start speaking the SFTP protocol over the SSH session, and the user can send and retrieve files. Figure 3-7 shows the details.
Figure 3.7. scp2/sftp operation
Our testing shows roughly a factor-of-four reduction in throughput from scp1 to scp2. We observe that the SFTP mechanism uses the SSH packet protocol twice, one encapsulated inside the other:
the SFTP protocol itself uses the packet protocol as its basis, and that runs on top of an SSH session. While this is certainly inefficient, it seems unlikely to be the reason for such a dramatic reduction in performance; perhaps there are simply implementation problems that can be fixed, such as bad interactions between buffering in different layers of the protocol code. We have not dug into the code ourselves to find a reason for the slowdown.