Modern computer systems come in many forms, ranging from hand-held smartphones to huge clusters of very capable standalone computers. A typical computer has several autonomous processors called cores (typically 2-8, sometimes many more) inside a very small package that uses billions of transistors. A general-purpose computer has at least one such package. A package may also include other components of a computer system, giving rise to the term system-on-a-chip (SoC).
The cores of a multi-core processor typically share some fast memory in addition to their own private memory. The cores, their private memories, and the shared memory are tightly connected by a high-throughput interconnect in the same package. These special memories (caches) are designed to maximize the flow of bits between the cores and main system memory, which is typically off-package. The package is a MIMD (multiple instruction, multiple data) computer: its cores can work separately on independent workloads, and they can also process the same workload in parallel. Cores typically run a multitasked workload of numerous threads from many programs under OS control, providing true concurrency.
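As a rough illustration of several workers processing the same workload in parallel, the following minimal sketch splits a list into one chunk per available core and sums the squares of each chunk in a separate thread. The function names `partial_sum` and `parallel_sum_of_squares` are hypothetical, chosen only for this example. Note that in the standard CPython build the global interpreter lock lets only one thread run Python bytecode at a time, so this shows the structure of the decomposition rather than a guaranteed speedup; `ProcessPoolExecutor` from the same module offers the same `map` interface using separate processes.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def partial_sum(chunk):
    """Work done by one worker: sum of squares over its own chunk."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(values, workers=None):
    """Split `values` into roughly equal chunks and combine the partial results."""
    # Default to one worker per core reported by the OS (fall back to 1).
    workers = workers or os.cpu_count() or 1
    # Ceiling division so that every element lands in some chunk.
    size = max(1, -(-len(values) // workers))
    chunks = [values[i:i + size] for i in range(0, len(values), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each chunk is handed to a worker thread; the OS decides which core runs it.
        return sum(pool.map(partial_sum, chunks))


if __name__ == "__main__":
    data = list(range(1_000))
    print(parallel_sum_of_squares(data, workers=4))
```

The same pattern (split, compute independently, combine) applies whether the workers are threads of one program or separate processes; only the cost of sharing data between them changes.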
Each core in a package uses a low-latency multi-stage pipeline capable of centralized management of its instruction streams (much like old-school big centralized computers, but far more efficient). It may have a sophisticated dynamic scheduling unit that can issue multiple instructions at a time and execute their operations in parallel internally, ordered by their data dependencies (a built-in dataflow unit). It likely also supports SIMD (single instruction, multiple data) instructions, with significant hardware support for processing sets of data operands efficiently. These special instructions can speed up important workloads in scientific and engineering computing and in commercial and consumer audio/video encode/decode applications.
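To make the idea of issuing instructions by data dependency more concrete, here is a toy model, not a description of any real core. It assumes every register is written at most once and defined before it is used (roughly what register renaming provides), that only true read-after-write dependencies matter, and that at most `issue_width` instructions issue per cycle. The `Instr` type, the `schedule` function, and the sample `program` are all hypothetical names invented for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    name: str         # label used in the result
    dst: str          # register written (assumed to be written only once)
    srcs: tuple       # registers read
    latency: int = 1  # cycles from issue until the result is available


def schedule(instrs, issue_width=2):
    """Return {instruction name: issue cycle} under a toy dataflow issue rule."""
    # Map each register to the instruction that produces it.
    producer = {ins.dst: ins.name for ins in instrs}
    done_at = {}      # instruction name -> cycle its result becomes available
    issue_cycle = {}
    pending = list(instrs)
    cycle = 0
    while pending:
        issued = []
        for ins in pending:  # oldest instruction first
            if len(issued) == issue_width:
                break
            # Sources with no producer in the program count as already available.
            deps = [producer[s] for s in ins.srcs if s in producer]
            if all(d in done_at and done_at[d] <= cycle for d in deps):
                issued.append(ins)
        for ins in issued:
            issue_cycle[ins.name] = cycle
            done_at[ins.name] = cycle + ins.latency
            pending.remove(ins)
        cycle += 1
    return issue_cycle


program = [
    Instr("i0", "r1", ("a",), latency=3),        # load r1
    Instr("i1", "r2", ("r1", "r9")),             # needs i0
    Instr("i2", "r3", ("b",), latency=3),        # independent load
    Instr("i3", "r4", ("r3", "r8"), latency=2),  # needs i2
    Instr("i4", "r5", ("r2", "r4")),             # needs i1 and i3
]

if __name__ == "__main__":
    print(schedule(program, issue_width=2))
```

In this model the second load `i2` issues alongside `i0` even though `i1` precedes it in program order, because `i1` is still waiting for its operand; that reordering is the effect the dataflow description refers to.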