There are some files I need to work with, which are currently on a Linux computer. I wanted to transfer them over to my computers, Macs both, to work with them more easily. But somewhere in the process, the files are getting mangled in a very peculiar way, and I’d like to figure out how.
The original files are ASCII text, organized as a tab-delimited table. The first column is an index number, the second column is a radius value, the third column is a density, and so on. The files have no particular extension. When they get mangled, however, the result appears to be that each column of values ends up as its own block of text, followed by a block of text for the next column, and so on.
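One way to turn that description into something checkable is to count the tab-separated fields on each line and to look at the raw bytes of a file. The following is only a rough sketch: the helper names `field_counts` and `peek_bytes` are made up for illustration, and it assumes nothing beyond the standard `awk`, `sort`, `paste`, `head` and `od` tools found on both Linux and macOS.

```shell
# Hypothetical helper: print the distinct numbers of tab-separated fields
# found across the lines of a file. An intact table should report a single
# value (its column count); a mangled copy would be expected to differ.
field_counts() {
    awk -F '\t' '{ print NF }' "$1" | sort -nu | paste -s -d ' ' -
}

# Hypothetical helper: show the first 200 bytes with control characters
# spelled out (\t, \n, \r), so the real separators and line endings
# become visible instead of being interpreted by a viewer.
peek_bytes() {
    head -c 200 "$1" | od -c
}
```

Running both helpers on the same file on each machine, before and after the copy, would show whether the separators themselves differ between the two copies or only the way they are displayed.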
The first time I tried copying the files, I (sitting at the Linux computer) did something like
scp -r * me@mycomputer:path/
The result was that all of the data files got mangled in the above-described manner. Once I realised that the files were getting mangled, I went back over to the Linux computer and did some experiments. If I copied over a single file, it still got mangled. If I copied over the file as-is, and then added the extension .txt at the other end (either as part of the copy command, or as a separate command afterwards), it still got mangled.
However, if I first renamed the file on the Linux box to have a .txt extension, and then transferred it over to the other computer, it was not mangled. Aha, I conclude, there must be some bug in scp, such that it didn’t know what to do with this ASCII file that wasn’t called .txt. So I tar together all of the files on the Linux computer, and then use scp to transfer over the whole .tgz file. Surely, scp can’t mess that up. And yet, when I now expand that .tgz file back out, lo and behold, the data files are messed up, in the same way as before. Now, I know that it can’t have been scp’s fault, because it was a compressed tar file (and yes, I confirmed that it really was compressed), so scp couldn’t possibly have been doing anything to the innards of any particular file. Had it done anything untoward to the .tgz file, it would have rendered the archive unopenable, or at least scrambled its contents into something indistinguishable from random noise.
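The claim that scp left the archive untouched can be checked directly by comparing a checksum of the .tgz file on both machines. This is a minimal sketch under assumptions: the helper names `digest` and `same_bytes` are hypothetical, and it relies on either `sha256sum` (common on Linux) or `shasum` (shipped with macOS) being installed.

```shell
# Hypothetical helper: print the SHA-256 digest of a file, using whichever
# tool is available. Reading via stdin keeps the file name out of the output.
digest() {
    if command -v sha256sum >/dev/null 2>&1; then
        sha256sum < "$1" | awk '{ print $1 }'
    else
        shasum -a 256 < "$1" | awk '{ print $1 }'
    fi
}

# Hypothetical helper: succeed (exit status 0) only if two local files
# have the same digest, i.e. are byte-for-byte identical.
same_bytes() {
    [ "$(digest "$1")" = "$(digest "$2")" ]
}
```

Running `digest` on the archive on the Linux box and again on the Mac, then comparing the two strings by eye, would confirm whether the bytes survived the transfer. Identical digests would only rule out the transfer step; they would not say where the mangling actually happens.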
So now, I’m forced to conclude that either there’s the same weird bug involved in both scp and tar, or there’s something weird about the transition of these files from one system to the other. But what, and what can I do to fix it? Incidentally, any fix absolutely must be automated, since there are tens of thousands of separate files involved here.
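With tens of thousands of files, even the diagnosis has to be scripted. The sketch below does not fix anything; it only lists files whose lines do not all have the expected number of tab-separated columns, so that a later fix can be aimed at exactly those files. The function name `list_suspect_files` and its two arguments (a directory and a column count) are assumptions for illustration, since the real column count is not stated here.

```shell
# Hypothetical helper: list regular files under DIR in which at least one
# line does not have exactly NCOLS tab-separated fields.
# Usage: list_suspect_files DIR NCOLS
list_suspect_files() {
    dir=$1
    ncols=$2
    # "-exec ... {} +" hands many files to each awk run; FNR restarts at 1
    # for every file, which is where the per-file flag is reset.
    find "$dir" -type f -exec awk -F '\t' -v n="$ncols" '
        FNR == 1 { flagged = 0 }
        NF != n && !flagged { print FILENAME; flagged = 1 }
    ' {} +
}
```

Run against the original directory on the Linux box, this should ideally print nothing, while the same command on the copied directory would print the mangled files. Any non-table files in the tree (such as the .tgz archive itself) would also be flagged and would need to be excluded.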