
I have to copy files on a machine, and the data is immensely large. The servers still need to serve normally, and they usually have a particular range of busy hours. So is there a way to run such a command so that, if the server hits its busy hours, the process pauses, and when it gets out of that range, it resumes?

Intended result:

cp src dst

If the time is between 9:00 and 14:00, pause the process.
After 14:00, resume the cp command.
  • rsync can resume partial transfers – Thorbjørn Ravn Andersen Feb 21 '19 at 15:37
  • Do you need the actual data to be copied as a backup? If not, could you use cp -al to make a hardlink farm? Or use a filesystem that supports block-level reflinks with copy-on-write, using cp -a --reflink=auto? BTRFS and ZFS support that for copies within the same physical device. – Peter Cordes Feb 21 '19 at 17:05
  • Do any of the files in src change between 9:00 and 14:00? If so, simply pausing and resuming the cp process may result in corrupted files. It may be better to run rsync in combination with the timeout command. – Mark Plotnick Feb 21 '19 at 19:51
  • From and to where are the files being copied? Is this a virtual system? What is the source filesystem? What's the purpose of the copy? – Braiam Feb 24 '19 at 20:23
  • @Braiam I'm using rsync, copying files from a remote machine onto the local one. I just used the cp command as an example here, by the way. – Sollosa Feb 26 '19 at 6:35

Yes, you need to acquire the process id of the process to pause (via the ps command), then do:

$> kill -SIGSTOP <pid>

The process will then show up with Status "T" (in ps).

To continue, do a:

$> kill -CONT <pid>
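
For example, a minimal end-to-end sequence might look like this (a sketch: it assumes the copy was started as cp src dst, and the pgrep -f pattern is only an illustration, adjust it to your actual command line):

# find the copy's PID (first match); the pattern is a placeholder
pid=$(pgrep -f 'cp src dst' | head -n 1)
kill -SIGSTOP "$pid"     # pause the copy
ps -o stat= -p "$pid"    # prints "T" while it is stopped
kill -CONT "$pid"        # resume it later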

You can pause execution of a process by sending it a SIGSTOP signal and then later resume it by sending it a SIGCONT.

Assuming your workload is a single process (doesn't fork helpers running in background), you can use something like this:

# start copy in background, store pid
cp src dst &
echo "$!" >/var/run/bigcopy.pid

Then when busy time starts, send it a SIGSTOP:

# pause execution of bigcopy
kill -STOP "$(cat /var/run/bigcopy.pid)"

Later on, when the server is idle again, resume it.

# resume execution of bigcopy
kill -CONT "$(cat /var/run/bigcopy.pid)"

You will need to schedule this for the specific times when you want it executed; you can use tools such as cron or systemd timers (or a variety of other similar tools) to do the scheduling. Instead of scheduling based on a time interval, you might choose to monitor the server (perhaps looking at load average, CPU usage or activity from server logs) to decide when to pause/resume the copy.
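
For instance, a crontab sketch for the 9:00-14:00 window from the question (assuming the PID file path used above; each entry fires at minute 0 of the given hour):

# crontab entries: pause the copy at 09:00, resume it at 14:00
0 9  * * * kill -STOP "$(cat /var/run/bigcopy.pid)"
0 14 * * * kill -CONT "$(cat /var/run/bigcopy.pid)"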

You also need to manage the PID file (if you use one): make sure your copy is actually still running before pausing it, and you'll probably want to clean up by removing the pidfile once the copy is finished.
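
A small guard along those lines (only a sketch, reusing the same PID file; note that kill -0 merely checks that a process with that PID exists, it cannot guarantee the PID still belongs to your copy):

pidfile=/var/run/bigcopy.pid
if [ -f "$pidfile" ] && kill -0 "$(cat "$pidfile")" 2>/dev/null; then
    kill -STOP "$(cat "$pidfile")"    # copy still running, pause it
else
    rm -f "$pidfile"                  # finished or stale, clean up
fi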

In other words, you need more around this to make it reliable, but the basic idea of using the SIGSTOP and SIGCONT signals to pause/resume execution of a process seems to be what you're looking for.

  • Maybe add a reminder that you should be very careful that '/var/run/bigcopy.pid' still refers to the same process as you think it does. Randomly stopping other processes on the system may not be desirable. I know of no safe way to ensure that the PID refers to the program you think it does, though... – Evan Benn Feb 22 '19 at 2:27
  • @EvanBenn Yeah that's what I meant in a way with "make sure your copy is actually still running before pausing it" though your point is surely more explicit than that! Yeah checking PIDs is inherently race-y so it's sometimes not really possible to do it 100% reliably... – filbranden Feb 22 '19 at 3:02
  • @cat Not really, a process can't block SIGSTOP. See the link from the first comment: "SIGSTOP is a non-blockable signal like SIGKILL" (or just google it, you'll see that's the case.) – filbranden Feb 22 '19 at 3:42

Instead of suspending the process, you could also give it lower priority:

renice 19 "$pid"

will give it the lowest priority (highest niceness), so that process will yield the CPU to other processes that need it most of the time.

On Linux, the same can be done with I/O with ionice:

ionice -c idle -p "$pid"

will put the process in the "idle" class, so that it will only get disk time when no other program has asked for disk I/O for a defined grace period.
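
If the copy has not started yet, the priorities can also be set up front rather than adjusted afterwards (a sketch, reusing the src/dst placeholders from the question and the PID file from the earlier answer; nice and ionice both exec the command, so "$!" is still the copy's PID):

# start the copy with the lowest CPU priority and the idle I/O class (Linux)
ionice -c idle nice -n 19 cp src dst &
echo "$!" >/var/run/bigcopy.pid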

  • This is a typical case of an XY problem. The question was how to pause a process, but this does not answer the question. While indeed lowering the priority is the better approach to the actual problem, it does not answer the question. I would edit the question to also include how to pause a process and why pausing might be a problem (e.g. file could be edited while paused). – MechMK1 Feb 21 '19 at 16:47
  • @DavidStockinger, technically, this answer tells how to tell the OS to pause the process when it (the OS, CPU, I/O scheduler) is busy (even if it's for fractions of seconds at a time). How to suspend the process manually has already been covered in other answers. This solution doesn't address the problem of files being modified whilst they are being copied. – Stéphane Chazelas Feb 21 '19 at 16:59
  • Changing the I/O priority isn't always the best solution. If you're copying from spinning disks, you may still incur a seek before each high-priority request which you wouldn't incur if you completely paused the low-priority operation. – Mark Feb 21 '19 at 22:37
  • Lower priority does not even solve the problem. Even if the box is completely idle for a few seconds or minutes, that does not mean that a huge copy process which will evict everything from the filesystem cache is going to be unobtrusive. As soon as there's a load again, it's going to be very slow paging everything back in. – R.. GitHub STOP HELPING ICE Feb 22 '19 at 19:00
  • @DavidStockinger the preferred way of dealing with XY problems is to give the right solution, even if that's not what the question is asking for. When you know the approach described in the question is wrong, then a good answer doesn't give that wrong approach but instead proposes a better one. – terdon Feb 23 '19 at 15:00

Use rsync instead of cp for this scenario. It has parameters to limit bandwidth, and it can be killed/stopped and started again later in a way that continues where it left off; search for rsync examples.
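
As a hedged illustration (these flags are standard rsync options; the bandwidth figure is arbitrary and expressed in KiB/s):

# limit the transfer rate and keep partial files so an interrupted run can resume
rsync -a --partial --bwlimit=10000 src/ dst/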


If you are going to do it by interrupting the running process, I suggest playing with the screen program. I haven't used Linux in a while, but IIRC just pausing the command and resuming it later leaves you pretty vulnerable: if you accidentally get logged off, you won't be able to resume your session.

With screen I believe you can interrupt the session, then detach it and log out. Later you can go back in and reattach to that session. You'd have to play with it a bit, but it makes sessions much more robust.

You can also log out and go home, then log in remotely, reattach to the session you started in the office and resume it for the evening, then pick it up again the next day at work.
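
The basic workflow is short (standard screen usage; the session name bigcopy is just an illustration):

screen -S bigcopy      # start a named session and run the copy inside it
# press Ctrl+A then d to detach, leaving the session running
screen -r bigcopy      # reattach to the session later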

  • I'm already using tmux for that. But I'm writing a script that would be self-aware, or preferably environment-aware, so it stops if the server gets high traffic and continues when it's back to normal. – Sollosa Feb 24 '19 at 13:42

If your shell supports it (almost all do), you can press ^Z (Ctrl+Z) to easily send a SIGTSTP signal to the foreground task, then continue it with fg (in the foreground) or bg (in the background).

If you do this on multiple tasks and want to return to them later, you can use the jobs command, then return with fg/bg %#, where # is the number given in brackets by jobs.
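
For example, a typical interactive sequence (a sketch; the job number [1] depends on what else is running in your shell):

cp src dst     # start the copy in the foreground
# press Ctrl+Z; the shell reports something like: [1]+  Stopped  cp src dst
jobs           # list suspended/background jobs
bg %1          # let job 1 continue in the background
fg %1          # or bring it back to the foreground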

Keep in mind that SIGTSTP is a bit different from SIGSTOP (which is used in all the other answers), most importantly because it can be ignored (though I haven't seen a program ignore it other than sl). More details can be found in this answer on Stack Overflow.

  • Surprised that no answer mentioned this yet. – Ave Feb 25 '19 at 11:23
  • Ty Ave, I know this multitasking trick. But for that to happen, one needs to be at a terminal, whereas I want to build a script that'll do the job on its own, no matter if it takes days. – Sollosa Feb 26 '19 at 6:32
  • @Sollosa it can be useful to others with the same question, and with access to a terminal. – Ave Feb 27 '19 at 5:08
  • I agree. Nice knowing you Ave :) – Sollosa Feb 27 '19 at 11:34
