Just to clear things up, the 30/60/120/240 Hz rates are all remnants of processing for film and video playback (there is also 24 Hz, which wasn't mentioned). It all comes down to fundamental math and the film and video standards that existed. Film was mainly shot at 24 Hz, partly because it was seen as good enough to capture motion, but also slow enough to save on the cost of the film itself (going faster meant using more film per minute). TV broadcast in the US market initially used 30 Hz but fairly quickly changed to 60 Hz (Europe and some other countries used 25 Hz and 50 Hz respectively).
120 Hz was an interesting number because it was the first intersection of the three main video formats in the US market: 24 Hz, 30 Hz, and 60 Hz content could all be displayed on a monitor with a 120 Hz refresh and processing rate without any change to the original content's pacing (at 120 Hz, 24 Hz film simply shows each frame 5 times, 30 Hz video shows each frame 4 times, and 60 Hz video shows each frame 2 times). Earlier monitors and TV sets that attempted to show 24 Hz film usually ran into problems because they typically had 60 Hz refresh rates, which led to a stutter in the film's pacing. They used 3:2 pulldown processing, wherein one frame was held for 3 refresh cycles and the next frame for 2 (this mathematically converts 24 frames per second into 60 refreshes, since (24/2 × 3) + (24/2 × 2) = 36 + 24 = 60). But as you can see, that created a jerky feel compared to the original, since half of the frames are held on screen 50% longer than the other half. A 120 Hz display removes that stutter. The same condition holds again at 240 Hz, as everything is simply doubled over the 120 Hz values.
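If it helps, here is a quick Python sketch of that arithmetic (purely illustrative, not anything a display actually runs): it confirms 120 is the lowest rate that divides evenly by all three source rates, and shows which combinations need an uneven cadence like 3:2 pulldown.

```python
from math import lcm

# Source frame rates (Hz) for film and US video, and some display refresh rates.
sources = [24, 30, 60]
displays = [60, 120, 240]

# The lowest refresh rate that shows every source evenly is their least common multiple.
print("LCM of 24/30/60:", lcm(*sources))  # -> 120

for hz in displays:
    for src in sources:
        repeats, remainder = divmod(hz, src)
        if remainder == 0:
            # Even multiple: every frame is held for the same number of refresh cycles.
            print(f"{src} Hz content on a {hz} Hz display: each frame shown {repeats} times")
        else:
            # Not an even multiple: frames must be held for unequal times (judder).
            # e.g. 24 Hz film on a 60 Hz display uses 3:2 pulldown:
            # 12 frames held 3 cycles + 12 frames held 2 cycles = 36 + 24 = 60 refreshes/second.
            print(f"{src} Hz content on a {hz} Hz display: uneven cadence (e.g. 3:2 pulldown)")
```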
For computer displays, none of this is really an issue outside of playing back those video formats on the computer. For generated graphics, the ideal display is simply one that can keep up with whatever your graphics card can generate. There are also diminishing returns after a certain point, as the human eye and brain can only process a certain number of frames per second. Some of the latest studies have shown that for conscious interpretation of images, the limit is about 13 milliseconds: an image needs to be displayed on screen for roughly that long for a person to remember seeing it and know what the picture was. That translates loosely to ~77 Hz. But also remember that the brain and eye did not evolve to look for something that mysteriously appears and disappears like an image on a screen; they work on object recognition and the translation/movement of those objects from an existing state to a new one. This is why studies have shown we can subconsciously process and recognize some images even faster (especially images of human faces, and even more so faces of people we are familiar with, such as close family members). But it still comes back to this: once you hit the 120 Hz mark, a person has a very hard time getting any additional benefit from a faster display (we are already pushing the limits of reaction times at that point as well).
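For reference, the 13 ms to ~77 Hz conversion is just the reciprocal; a tiny sketch (the 13 ms figure comes from the studies mentioned above, the rest is plain arithmetic):

```python
# Rough conversion between the ~13 ms recognition threshold and a refresh rate,
# plus the per-frame display time at common refresh rates.
threshold_ms = 13
print(f"{threshold_ms} ms per image ~= {1000 / threshold_ms:.0f} Hz")  # ~77 Hz

for hz in (60, 120, 240):
    print(f"{hz} Hz display: each refresh lasts {1000 / hz:.2f} ms")
```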