Simplifying microprocessors

What is the difference between Intel Core 2 Duo and Intel Core 2 Quad? Or, for that matter, between Intel Core 2 Duo and Intel Dual Core? If you are not sure about the answers to these questions, then read on.

This article looks at microprocessors from the standpoint of the Intel family. The college syllabus begins at the Intel 8086 and stops at the Pentium. Here we will start from the Intel Pentium and move up to the Intel i7. The subject is vast and complex, so I have made a real effort to keep this article as simple and short as possible. I hope you find it useful.

Fundamental difference

The words CPU, processor and microprocessor are often used loosely. Some people use CPU for the complete box that contains the processor, motherboard, hard disk, RAM, SMPS, etc., while others use it as an equivalent of processor. The second meaning is the one we will use in this article.

When I say equivalent, I mean equivalent but not exactly equal. Every CPU is a processor, but the reverse is not true: not every processor is a CPU. A processor can serve as a CPU, a GPU, a DMA controller and so on. We can say that CPU and GPU are conceptual roles, while the processor is the actual hardware that performs the designated function.

Core and thread of processor

I find this distinction slightly difficult to articulate, but it is probably the most central concept in modern processors. Each CPU is built on a small rectangular piece of semiconducting material, such as silicon, called a die. One or more processors can be fabricated on a single die. The circuitry itself is extremely thin, so the die is what gives the processor a physical body; without it, we would not be able to hold the processor in our hand.

When a single processor is fabricated on the die, it is said to have a single core. When there is more than one processor, it becomes a multi-core system. Every processor needs cache (L1, L2, L3), a data bus, an address bus, etc. to function properly. Because of size limitations, different multi-core architectures have evolved in which some things are dedicated to each processor while others, such as the L2 and L3 caches and the bus, are shared. Below is one such architecture.

Multicore system

In the above architecture, if there are two cores, there will be two L1 caches, plus a single L2 cache and a single bus circuit shared between the two cores. Note that each core is still a processor or CPU, but in general the entire die is also referred to as one processor. This leads to significant confusion when a marketing person says that the Intel Core 2 Duo processor has 2 CPUs. To avoid the confusion, we can use a simple analogy with the ALU. Each processor has a central assembly called the ALU (Arithmetic Logic Unit) that carries out the arithmetic and logic instructions. So, in a not quite technically correct analogy, a core can be thought of as an ALU, and the above architecture is one big processor that hosts two cores or ALUs. From the OS perspective, even though there is just one physical chip or one big processor, the OS kernel sees two entirely separate processors.

To understand threads, we will start with multitasking. Processor multitasking is the capability by which a single processor appears to run many programs at once. For example, on a Windows 7 system one can open Word, Excel, Paint, etc. all at the same time. Here the word simultaneous is misleading. A single processor with a single core still executes everything one piece at a time, but it switches between programs so quickly (clock speeds are measured in GHz) that a human perceives the execution as simultaneous. Multitasking switches from one program to another in a time-sliced manner. When a process takes a long time to execute or is waiting for an I/O operation, multitasking kicks in and switches to another program, thereby increasing overall performance.
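The time-sliced switching described above can be sketched as a small simulation. This is only an illustration under simple assumptions: the function name `round_robin`, the program names and the work units are all made up for the example, and a real OS scheduler also weighs priorities and I/O waits.

```python
from collections import deque


def round_robin(tasks, time_slice):
    """Simulate one core sharing its time between several programs.

    tasks maps a (hypothetical) program name to the units of work it needs.
    Returns the list of (program, units_run) slices in execution order.
    """
    queue = deque(tasks.items())
    timeline = []
    while queue:
        name, remaining = queue.popleft()
        run = min(time_slice, remaining)  # run for one slice at most
        timeline.append((name, run))
        if remaining > run:
            # Not finished yet: go to the back of the queue and wait.
            queue.append((name, remaining - run))
    return timeline


timeline = round_robin({"word": 3, "excel": 5, "paint": 2}, time_slice=2)
for name, run in timeline:
    print(f"{name} runs for {run} unit(s)")
```

Only one program runs in any given slice, yet every program makes progress, which is why a fast enough core looks as if it were running them all at once.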

Multitasking is the responsibility of both the processor and the OS. If your processor does not support multitasking, there is no point in the OS having multitasking capability, and vice versa. In theory, if you install plain old DOS on an i7 processor, you will get no multitasking, because DOS does not support it.

Multitasking exploits parallelism at the OS level. But what if there is just one large application running and it needs to wait for an I/O operation to complete? The processor cannot really multitask, as it has no other application to serve. Wouldn't it be great to have a mechanism by which the application itself is split into several components that execute simultaneously? Then, if one part of the application has to wait for some reason, the processor can continue executing another, independent component. Here we are talking about parallelism at the level of an individual application, or extending multitasking to the application level. This very idea is called multi-threading.
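As a minimal sketch of this idea, the snippet below splits one application into two components that each spend their time waiting. The component names are hypothetical and `time.sleep` merely stands in for a real I/O wait; the threads used here are the OS-level threads offered by Python's standard `threading` module.

```python
import threading
import time


def component(name, wait_seconds, results):
    """One independent part of the application (names are illustrative)."""
    time.sleep(wait_seconds)  # stands in for waiting on an I/O operation
    results.append(name)      # record that this component has finished


results = []
threads = [
    threading.Thread(target=component, args=("load_file", 0.2, results)),
    threading.Thread(target=component, args=("update_screen", 0.2, results)),
]

start = time.perf_counter()
for t in threads:
    t.start()  # both components begin their wait at about the same time
for t in threads:
    t.join()   # block until each component has finished
elapsed = time.perf_counter() - start

print(f"both components finished in about {elapsed:.2f}s")
```

Because the two waits overlap, the total time should be close to one wait rather than the sum of both, which is exactly the gain multi-threading is after.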

There are many types and implementation architectures of multi-threading. Instead of discussing those, we will just look at basic threading from a conceptual point of view. The first threads to appear were operating-system-level threads. If the OS provides a threading capability, an application can use the threading API provided by the OS irrespective of whether the processor supports threading. So, if the processor has no built-in threading, the OS will either execute the program without threads or simulate a threading environment for the program.

Obviously, this simulation of threading at the OS level is limited, and the performance gain is not significant, because the ultimate workhorse, the processor on which the OS is running, has just one core and one thread. Thus, next comes threading at the processor level, where a single core within a processor can execute more than one thread at a time. As you might have guessed, just like multitasking, multithreading is the responsibility of both the processor and the kernel. Multithreading is much more difficult to implement than multitasking: multitasking works at the application level, where each application has its own separate address space (memory), while multithreading must work at the level of individual threads, which share the same memory with other threads.

So, to sum up the story, a core is an actual physical core inside the processor, whereas a thread is a logical core of the processor. Let's do some simple mathematics. If a processor has 4 cores and each core supports 2 threads, the kernel of the OS thinks that there are actually 8 CPUs or cores (4 * 2 = 8; each core exposes itself as two cores). If you want to see how many threads your processor has, open Task Manager on a Windows 7 desktop and go to the Performance section. Select the one-graph-per-CPU option and you will see the magic. Below is the image for an Intel Core i3 that has 2 cores, each supporting 2 threads, resulting in a total of 4 CPUs for the OS.

Core i3 logical view

Note that if I install Windows 95 or 98 on the i3 processor, I will see just one CPU, as these operating systems cannot use more than one processor.
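The cores-times-threads arithmetic from this section can be written down in a few lines. The helper name `logical_cpus` is my own; `os.cpu_count()` is the Python standard library call that reports how many logical CPUs the OS sees on the machine running the script (it may return None if that cannot be determined).

```python
import os


def logical_cpus(cores, threads_per_core):
    """Number of CPUs the OS kernel sees: physical cores times threads per core."""
    return cores * threads_per_core


print(logical_cpus(4, 2))  # the 4-core, 2-threads-per-core example: 8
print(logical_cpus(2, 2))  # the Core i3 example: 4
print(os.cpu_count())      # logical CPUs reported on this machine
```

The last line is the programmatic counterpart of counting the graphs in Task Manager.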

First revolution: Intel Pentium 4 HT

For all Intel processors from the 8085 and 80186 up to the original Intel Pentium 4, the architecture is very simple in this respect: there is a single die with a single core and no multi-threading. Thus, we will not get into the details of these processors.

In November 2002, Intel introduced something new to desktop processors: a commercial implementation of multithreading in hardware. Intel branded this technology as Hyper-Threading Technology. The first processor in this line was the Intel Pentium 4 HT. As said earlier, it was one big processor on one die with one core (processing unit) but two threads, i.e. one physical CPU working as two logical CPUs. If you open Task Manager, you will see something like:

Pentium 4 HT logical view

Monument of problems

As the demand for more speed increased, new problems that nobody, not even the great Dr. Moore, had foreseen began to surface. These are problems of electronics rather than computer engineering, so the details may be too much for us. In short, the complexity of systems started to increase exponentially, and so did the requirement for speed and performance. Initially, Intel planned to push processor clock speeds to 10 GHz, but that never happened. As far as I remember, the maximum speed reached was 3.8 GHz. This ceiling is often described as the thermal limit.

The problem was heat. As Intel and AMD started putting more and more transistors on a single core, the transistor power leakage problem intensified, producing tremendous heat that could damage the entire system permanently. If you search YouTube, you will find videos in which a processor starts burning within a minute or so once the giant cooling fan that rests on top of it is removed.

So this road was a dead end. The year was 2003, and nanotechnology was not yet advanced enough to create multicore systems. The only way forward was to create a processor with multiple dies, where each die would have a single processor or core on it.

New Multiprocessor system: Intel Pentium D

The Intel Pentium D had two dies, each containing a single core, and both dies sat next to each other in a single rectangular chip package. The OS Task Manager view would be the same as that of the Intel Pentium 4 HT processor. The Intel Pentium D was a true multiprocessor system.

As you might guess, this design also suffered from power and heat problems. Two joined yet separate processors would consume about twice the power of a normal processor. And since the heat produced is directly proportional to the power consumed, we can imagine how much heat it generated. On top of that, there was again a physical limit on how many dies could be added.

Evolving from Multiprocessor to Multicore systems

At the start of 2006, Intel introduced another evolution in technology. Instead of increasing the clock speed or the number of dies, Intel devised a third way: increasing the number of cores in the processor. A processor would have a single die but multiple cores, which is what we now call a modern processor.

The very first series out of the shop was the Intel Core series. You must have noticed that whenever I write Intel Core, the C in Core is capital, but in the word multicore the c is small. The reason is that Intel Core is a trademark of Intel Corporation for its multicore processors. So AMD cannot use something like AMD Core XX.

The two processors included in the Intel Core series were the Intel Core Solo and the Intel Core Duo. They both have one die and two cores, but in the case of the Solo, Intel disables one of the cores during manufacturing. Why does Intel do this? Because it allows Intel to use the same manufacturing line for both processors and hence save on capital. Frankly speaking, the Solo was sold as a low-cost processor and was not very popular; in India, hardly anyone would have seen one. The clock speed of the Core series is much lower than that of the Pentium series, but it achieves much better performance than the Pentium because of its multicore architecture.

Now you might wonder what the difference is between Intel Core Duo and Intel Core 2 Duo, or between Intel Core Solo and Intel Core 2 Solo. The trick to understanding these processor names is shown in the image below:

Understanding Intel family

The words “Intel Core” mean it is a multicore system. The presence of the 2 means it is a 64-bit microprocessor, whereas its absence means it is a 32-bit system. The word after the 2 is Solo, Duo or Quad, which indicates the number of cores present in the processor.

So when we talk about the Intel Core 2 Duo, we mean a 64-bit processor with 2 cores and 2 threads. Intel Core 2 Quad means a 64-bit processor with 4 cores and 4 threads. The Intel Core 2 Quad is also nicknamed Intel Quad Core, to signify that at its launch it had the highest number of cores in this family. Note that the Intel Core 2 Solo was never available for desktop computers. It was only for ultra-low-power notebooks and laptops.
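The naming rule above is regular enough to express as a small decoder. This is only a sketch of the convention as described in this article, not an official Intel scheme; the function name `decode_core_name` and the returned dictionary keys are my own choices for illustration.

```python
# Core counts for the suffix words used in the Intel Core / Core 2 names.
CORE_COUNT = {"Solo": 1, "Duo": 2, "Quad": 4}


def decode_core_name(name):
    """Decode names such as 'Intel Core 2 Duo' using the rule in this article."""
    parts = name.split()
    if parts[:2] != ["Intel", "Core"]:
        raise ValueError(f"not an Intel Core name: {name!r}")
    rest = parts[2:]
    # A '2' right after 'Core' marks the 64-bit generation.
    is_64_bit = bool(rest) and rest[0] == "2"
    if is_64_bit:
        rest = rest[1:]
    if len(rest) != 1 or rest[0] not in CORE_COUNT:
        raise ValueError(f"unknown core suffix in: {name!r}")
    return {"bits": 64 if is_64_bit else 32, "cores": CORE_COUNT[rest[0]]}


for name in ("Intel Core Duo", "Intel Core 2 Duo", "Intel Core 2 Quad"):
    print(name, "->", decode_core_name(name))
```

Since these processors run one thread per core, the number of threads equals the number of cores returned here.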

Intel also introduced another series called the Intel Dual Core series (sold as Pentium Dual-Core). These are nothing but stepped-down counterparts of the Intel Core 2 series. Typically they have less L2 cache and a lower clock speed, and many features such as Intel VT (virtualization technology) are absent. So Intel Core 2 and Intel Dual Core are both multicore architectures.

Intel Core iX series

Intel’s Core 2 brand was superseded by the new Intel Core iX series brand. The processors are i3, i5 and i7. These names are mostly marketing. There is no systematic naming convention like the one we saw for the Intel Core and Intel Core 2 families. The numbers 3, 5 and 7 show the relative power of one processor over another. The letter “i” has no significant meaning as such.

The Intel i3 processor has 2 cores, with each core having 2 threads. This is where Intel reintroduced the HT technology that it first brought out with the Pentium 4 HT. From a logical perspective this makes the i3 equal to the Intel Core 2 Quad, since the OS sees 4 CPUs in both cases. However, the Intel Core 2 Quad has a slight edge in raw speed, because its 4 logical CPUs are 4 physical cores with 1 thread each, whereas the i3 has 2 physical cores running 2 threads each. Nevertheless, for an ordinary user this difference in performance is not an issue, and it is counterbalanced by less noise and much lower power consumption.

The Intel i5 is similar to the Quad, with 4 cores and 4 threads, but with a better power rating and increased speed. There is a laptop variant of the i5 for low power requirements that has just 2 cores. The Intel i7 is a processor with 4 cores and 8 threads, i.e. 2 threads per core. The high-end variant, the Intel i7 Extreme, has 6 cores with 12 threads.

Points to note

This is how we can broadly describe and classify microprocessors on the basis of cores and threads. However, there are many other Intel processors that we have skipped, such as Itanium and Xeon. Then there is Intel's competitor, the AMD family, which will be addressed in a separate post. And if AMD and Intel are not enough, there is a new child on the horizon: ARM, the heart of modern smartphones. That is where the science of computer hardware in the microprocessor domain gets really nasty and political, which will again be addressed in a separate article. Finally, the Intel family can be summarized in the infographic below.

Intel family of modern processors