I am running a shell script on machineA that copies files from machineB and machineC to machineA.
If a file is not on machineB, it is guaranteed to be on machineC. So I try to copy each file from machineB first, and if it is not there, I copy the same file from machineC.
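To make the fallback rule concrete, here is a minimal sketch of it, not my actual script. The function name `copy_with_fallback` and the `COPY_CMD` hook are names I made up for illustration, and it assumes rsync exits with a non-zero status when the source file is missing on a host.

```bash
#!/usr/bin/env bash
# COPY_CMD is a hypothetical hook so the logic can be dry-run with a stub;
# by default it is a plain rsync call.
COPY_CMD=(rsync -az)

# Try each host in the given order and stop at the first one that delivers the file.
# Usage: copy_with_fallback REMOTE_PATH LOCAL_DIR HOST...
copy_with_fallback() {
    local remote_path=$1 dest=$2 host
    shift 2
    for host in "$@"; do
        # Assumption: the copy command fails (non-zero) if the file is not on this host.
        if "${COPY_CMD[@]}" "david@$host:$remote_path" "$dest/"; then
            return 0
        fi
    done
    echo "could not copy $remote_path from any of: $*" >&2
    return 1
}

# Hypothetical call, trying machineB first and machineC second:
# copy_with_fallback /data/pe_t1_snapshot/20140317/t1_weekly_1680_0_200003_5.data \
#     /export/home/david/dist/primary machineB machineC
```

Passing the hosts as arguments keeps the order (machineB before machineC) explicit at the call site.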
On machineB and machineC there are folders named in the YYYYMMDD format inside this folder:
/data/pe_t1_snapshot
Whichever folder has the latest date in the YYYYMMDD format inside the above folder is the one I need to copy the files from.
For example, if 20140317 is the latest date folder inside /data/pe_t1_snapshot, then the full path is:
/data/pe_t1_snapshot/20140317
This is the path on machineB and machineC from which I need to start copying. I need to copy around 400 files to machineA from machineB and machineC, and each file is 2.5 GB.
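The "latest date folder" rule could be sketched roughly as below. This is only an illustration: `latest_snapshot_dir` is a name I made up, and it picks the newest folder by name rather than by modification time, which works because YYYYMMDD names sort chronologically as plain text.

```bash
#!/usr/bin/env bash
# Print the newest YYYYMMDD directory directly under the given base directory.
# Returns non-zero (and prints nothing) if no such directory exists.
latest_snapshot_dir() {
    local base=$1 d latest=
    for d in "$base"/[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]/; do
        [ -d "$d" ] || continue            # the glob matched nothing
        d=${d%/}                           # drop the trailing slash
        # String comparison is enough: YYYYMMDD sorts chronologically.
        [[ $d > $latest ]] && latest=$d
    done
    [ -n "$latest" ] && printf '%s\n' "$latest"
}

# Local usage:
# latest_snapshot_dir /data/pe_t1_snapshot
```

To run it on machineB or machineC, one option (assuming the remote login shell is bash) would be to send the function definition over ssh, for example `ssh david@machineB "$(declare -f latest_snapshot_dir); latest_snapshot_dir /data/pe_t1_snapshot"`.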
Earlier, I was copying the files to machineA one by one, which is really slow. Is there any way to copy "three" files at once to machineA using threads in a bash shell script?
Below is my shell script, which copies the files one by one to machineA from machineB and machineC.
#!/usr/bin/env bash
readonly PRIMARY=/export/home/david/dist/primary
readonly FILERS_LOCATION=(machineB machineC)
readonly MEMORY_MAPPED_LOCATION=/data/pe_t1_snapshot
PRIMARY_PARTITION=(0 548 272 4 544 276 8 556 280 12 552 284 16 256 564 20 260 560 24 264 572) # this will have more file numbers around 200
dir1=$(ssh -o "StrictHostKeyChecking no" david@${FILERS_LOCATION[0]} ls -dt1 "$MEMORY_MAPPED_LOCATION"/[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9] | head -n1)
dir2=$(ssh -o "StrictHostKeyChecking no" david@${FILERS_LOCATION[1]} ls -dt1 "$MEMORY_MAPPED_LOCATION"/[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9] | head -n1)
## Build your list of filenames before the loop.
for n in "${PRIMARY_PARTITION[@]}"
do
    primary_files="$primary_files :$dir1"/t1_weekly_1680_"$n"_200003_5.data
done
if [ "$dir1" = "$dir2" ]
then
    find "$PRIMARY" -mindepth 1 -delete
    rsync -avz david@${FILERS_LOCATION[0]}"${primary_files}" $PRIMARY/ 2>/dev/null
    rsync -avz david@${FILERS_LOCATION[1]}"${primary_files}" $PRIMARY/ 2>/dev/null
fi
So I am thinking: instead of copying one file at a time, why not copy "three" files at once, and as soon as those three are done, move on to the next three files in the list and copy them at the same time?
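A rough sketch of the batching I have in mind is below. It uses background processes rather than real threads, and `run_in_batches` and `copy_one` are hypothetical names for illustration. Note that a plain `wait` does not report whether any individual copy failed, so error handling would still need to be added.

```bash
#!/usr/bin/env bash
# Run CMD once per item, at most MAX_JOBS at a time, in fixed-size batches.
# Usage: run_in_batches MAX_JOBS CMD ITEM...
run_in_batches() {
    local max_jobs=$1 cmd=$2 running=0 item
    shift 2
    for item in "$@"; do
        "$cmd" "$item" &                  # start one copy in the background
        running=$((running + 1))
        if [ "$running" -ge "$max_jobs" ]; then
            wait                          # block until the whole batch has finished
            running=0
        fi
    done
    wait                                  # wait for the final, possibly smaller, batch
}

# Hypothetical usage, where copy_one is a function that copies a single
# file number from machineB or machineC:
# run_in_batches 3 copy_one "${PRIMARY_PARTITION[@]}"
```

With this shape the next three copies only start once the slowest copy of the current batch is done, which matches the idea above but may leave some bandwidth idle.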
I tried opening three PuTTY instances and copying one file from each of them at the same time. All three files were copied in ~50 seconds, which was fast for me. For this reason, I want to copy three files at once instead of one file at a time.
Is this possible? If yes, can anyone provide an example? I just want to give it a shot and see how it works out.
@terdon helped me with the above solution, but I want to try copying three files at once to see how it behaves.
Update:
Below is a simplified version of the above shell script. It tries to copy files from machineB and machineC to machineA, since I am running it on machineA. It tries to copy the file numbers listed in PRIMARY_PARTITION.
#!/usr/bin/env bash
readonly PRIMARY=/export/home/david/dist/primary
readonly FILERS_LOCATION=(machineB machineC)
readonly MEMORY_MAPPED_LOCATION=/data/pe_t1_snapshot
PRIMARY_PARTITION=(0 548 272 4 544 276 8 556 280 12 552 284 16 256 564 20 260 560 24 264 572) # this will have more file numbers around 200
dir1=/data/pe_t1_snapshot/20140414
dir2=/data/pe_t1_snapshot/20140414
## Build your list of filenames before the loop.
for n in "${PRIMARY_PARTITION[@]}"
do
    primary_files="$primary_files :$dir1"/t1_weekly_1680_"$n"_200003_5.data
done
if [ "$dir1" = "$dir2" ]
then
    # delete the files first and then copy it.
    find "$PRIMARY" -mindepth 1 -delete
    rsync -avz david@${FILERS_LOCATION[0]}"${primary_files}" $PRIMARY/
    rsync -avz david@${FILERS_LOCATION[1]}"${primary_files}" $PRIMARY/
fi
ls output. – l0b0 Apr 21 '14 at 08:41
readonly declarations... – l0b0 Apr 21 '14 at 08:45
machineB and machineC into machineA so I need to download three files at a time from machineB and machineC into machineA. If the files are not there in machineB then they should be there in machineC for sure, so I will try to copy each file from machineB first; if it is not there in machineB then I will go to machineC to copy the same file. Let me know if anything is not clear. – arsenal Apr 21 '14 at 18:51