Concurrency vs. Parallelism

Concurrency and parallelism are related but distinct concepts:

  1. Concurrency: multiple threads make progress on multiple tasks at the same time, with shared state or shared resources between the threads.
    • e.g. two threads competing for the CPU.
    • e.g. two threads competing for the same object lock.
    • e.g. multiple GC threads concurrently marking collectible objects and then concurrently sweeping them.
    • concurrent execution stands in contrast to serial execution, e.g. Serial GC.
    • multiple threads can execute concurrently even on a single-core CPU.
  2. Parallelism: multiple threads each do their own work, with no shared state between them.
    • e.g. a large file is split into 10 smaller files, which are then analyzed and processed in parallel.
    • Linus Torvalds, while dismissing the "parallel computing is the future" hype, noted that parallel computing is useful for GPU graphics processing and on the server side.
    • the emphasis of parallelism is on the independence of the threads from one another.
    • it requires a multi-core CPU.
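The two definitions above can be sketched in Go (the article itself contains no code; the function names `concurrentCount` and `parallelSum` are mine, chosen for illustration): the first function has goroutines contend for a shared lock, while the second splits the input into independent chunks with no shared mutable state.

```go
package main

import (
	"fmt"
	"sync"
)

// Concurrency: several goroutines work on shared state (a counter) and
// compete for the same lock, like the "two threads competing for an
// object lock" example above.
func concurrentCount() int {
	var mu sync.Mutex
	var wg sync.WaitGroup
	count := 0
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu.Lock() // goroutines contend for this lock
			count++
			mu.Unlock()
		}()
	}
	wg.Wait()
	return count
}

// Parallelism: the input is split into independent chunks; each goroutine
// writes only to its own slot, so there is no shared mutable state.
func parallelSum(data []int, workers int) int {
	var wg sync.WaitGroup
	partial := make([]int, workers) // one slot per worker
	chunk := (len(data) + workers - 1) / workers
	for w := 0; w < workers; w++ {
		lo := w * chunk
		hi := lo + chunk
		if lo > len(data) {
			lo = len(data)
		}
		if hi > len(data) {
			hi = len(data)
		}
		wg.Add(1)
		go func(w, lo, hi int) {
			defer wg.Done()
			for _, v := range data[lo:hi] {
				partial[w] += v
			}
		}(w, lo, hi)
	}
	wg.Wait()
	total := 0
	for _, p := range partial {
		total += p
	}
	return total
}

func main() {
	fmt.Println(concurrentCount()) // 4
	data := make([]int, 100)
	for i := range data {
		data[i] = i + 1
	}
	fmt.Println(parallelSum(data, 4)) // 5050
}
```

Note that `concurrentCount` is still concurrent on a single core, whereas `parallelSum` only gains speed when the chunks actually run on multiple cores.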

Some views on concurrency and parallelism

Rob Pike

In his talk "Concurrency is not Parallelism", Rob Pike discusses the relationship and the difference between the two. His conclusions are excerpted below:

Difference

  1. Concurrency is about dealing with lots of things at once.
  2. Parallelism is about doing lots of things at once.
  3. Not the same, but related.
  4. One is about structure, one is about execution.
  5. Concurrency provides a way to structure a solution to solve a problem that may (but not necessarily) be parallelizable.

Conclusion

  1. Concurrency is powerful.
  2. Concurrency is not parallelism.
  3. Concurrency enables parallelism.
  4. Concurrency makes parallelism (and scaling and everything else) easy.

Jakob Jenkov

Concurrency is related to how an application handles the multiple tasks it works on. An application may process one task at a time (sequentially) or work on multiple tasks at the same time (concurrently).

Parallelism on the other hand, is related to how an application handles each individual task. An application may process the task serially from start to end, or split the task up into subtasks which can be completed in parallel.

An application can be concurrent, but not parallel. This means that it processes more than one task at the same time, but the tasks are not broken down into subtasks.

An application can also be parallel but not concurrent. This means that the application only works on one task at a time, and this task is broken down into subtasks which can be processed in parallel.

Additionally, an application can be neither concurrent nor parallel. This means that it works on only one task at a time, and the task is never broken down into subtasks for parallel execution.
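Jenkov's "concurrent but not parallel" case can be demonstrated directly in Go (a sketch under my own naming; `runConcurrently` is not from the article): pinning the scheduler to a single OS thread lets goroutines interleave without ever executing at the same instant.

```go
package main

import (
	"fmt"
	"runtime"
	"sort"
	"sync"
)

// runConcurrently executes the tasks as goroutines on a single OS thread:
// they interleave (concurrency) but never run at the same instant
// (no parallelism) -- the "concurrent but not parallel" case.
func runConcurrently(tasks []string) []string {
	prev := runtime.GOMAXPROCS(1) // one OS thread: interleaving only
	defer runtime.GOMAXPROCS(prev)

	var (
		mu   sync.Mutex
		wg   sync.WaitGroup
		done []string
	)
	for _, t := range tasks {
		wg.Add(1)
		go func(t string) {
			defer wg.Done()
			mu.Lock()
			done = append(done, t) // shared state, guarded by a lock
			mu.Unlock()
		}(t)
	}
	wg.Wait()
	sort.Strings(done) // completion order is nondeterministic
	return done
}

func main() {
	fmt.Println(runConcurrently([]string{"taskB", "taskA"}))
}
```

Restoring `GOMAXPROCS` and raising it back above 1 would give the same program the chance to be parallel as well, which is Pike's point that concurrency enables, but does not equal, parallelism.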

Linus Torvalds

I can imagine people actually using 60 cores in the server space, yes. I don't think we'll necessarily see it happen on a huge scale, though. It's probably more effective to make bigger caches and integrate more of the IO on the server side too.

On the client side, there are certainly still workstation loads etc that can use 16 cores, and I guess graphics professionals will be able to do their photoshop and video editing faster. But that's a pretty small market in the big picture. There's a reason why desktops are actually shrinking.

So the bulk of the market is probably more in that "four cores and lots of integration, and make it cheap and low-power" market.

But hey, predicting is hard. Especially the future. We'll see.

haskell.org

The term Parallelism refers to techniques that make programs faster by performing several computations in parallel. This requires hardware with multiple processing units. In many cases the sub-computations have the same structure, but this is not necessary. Graphics computations on a GPU are parallelism. The key problem of parallelism is to reduce data dependencies so that computations can be performed on independent processing units with minimal communication between them. To this end it can even be an advantage to do the same computation twice on different units.

The term Concurrency refers to techniques that make programs more usable. Concurrency can be implemented, and is used a lot, on single processing units, though it may also benefit from multiple processing units with respect to speed. Calling an operating system a multi-tasking operating system is a synonym for saying it supports concurrency. If you can load multiple documents simultaneously in the tabs of your browser while still opening menus and performing other actions, that is concurrency.

If you run distributed-net computations in the background while working with interactive applications in the foreground, that is concurrency. On the other hand, dividing a task into packets that can be computed by distributed-net clients is parallelism.
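The background-computation example above can be sketched in Go (the name `backgroundSum` and the workload are mine, chosen for illustration): the long-running computation runs in its own goroutine and delivers its result over a channel, while the caller stays free to keep handling foreground work.

```go
package main

import "fmt"

// backgroundSum starts a long-running computation in a goroutine and
// returns a channel that will deliver the result; the caller remains
// free to do interactive work in the meantime -- concurrency for
// usability, not for speed.
func backgroundSum(n int) <-chan int {
	out := make(chan int)
	go func() {
		sum := 0
		for i := 1; i <= n; i++ {
			sum += i
		}
		out <- sum
	}()
	return out
}

func main() {
	done := backgroundSum(1_000_000)
	// The "foreground" keeps doing its own work; for this sketch we
	// just count iterations until the background result arrives.
	events := 0
	for {
		select {
		case sum := <-done:
			fmt.Println(sum) // 500000500000
			fmt.Println("foreground iterations while waiting:", events)
			return
		default:
			events++
		}
	}
}
```

Splitting the same sum into chunks handed to independent workers, as in the distributed-net packet example, would turn this concurrent structure into a parallel one.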

References

  1. Concurrency is not Parallelism - Rob Pike
  2. Concurrency vs Parallelism - What is the difference?
  3. The difference between concurrent and parallel execution?
  4. Concurrency vs. Parallelism
  5. Parallelism vs. Concurrency
  6. Concurrency is not Parallelism
  7. Linus: The Whole "Parallel Computing Is The Future" Is A Bunch Of Crock