Coming Up
In this video I will look at how we measure data. If you have ever purchased a hard disk and noticed a difference between the amount of space printed on the box and how much space is reported by Windows, I will have a look at why this occurs. I will also look at how to read the two different systems used to measure the size of data.
Following this, I will look at transfer speed. This is how fast data gets transferred from one point to another. Sometimes this will include overhead, which does not give a true indication of how fast data is being transferred, so I will look at how to interpret these values so you can get a true indication of what kind of speed to expect.
This video covers computer fundamentals. Although you are unlikely to be asked a question in the exam on anything covered in this video, when working with computers it is important to understand the concepts covered in this video.
Bits and Bytes
To start with, I will look at bits and bytes, which make up the smallest data components inside a computer. Bits are switches and thus can be on or off. If the bit is on, the value is a one; if it is off, the value is a zero. A bit forms the smallest unit of data in a computer; however, bits are grouped together to make it easier for the computer to address where data is.
The smallest addressable unit of data inside a computer is generally a byte. A byte is 8 bits grouped together and in most computers is the smallest block of data that can be read from or written to. If the computer needs to change a bit inside the byte, the computer essentially has to read the whole byte, change the bit and then write the byte back. Thus, a bit forms the smallest changeable unit in a computer, but the byte is the smallest block of data that can be accessed. It is kind of like buying a carton of eggs: you may only want one egg, but you need to buy the whole carton. Eggs are sold in cartons because it is more efficient, and data is stored in bytes for the same reason.
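To make the read, change and write-back idea a little more concrete, here is a minimal Python sketch. The helper name set_bit and the one-byte "memory" are made up for illustration; this is a model of the idea, not how any particular processor does it.

```python
def set_bit(byte_value: int, bit_index: int, on: bool) -> int:
    """Return a copy of an 8-bit value with one bit switched on or off."""
    if not 0 <= byte_value <= 0xFF:
        raise ValueError("byte_value must fit in 8 bits")
    if not 0 <= bit_index <= 7:
        raise ValueError("bit_index must be between 0 and 7")
    mask = 1 << bit_index
    return byte_value | mask if on else byte_value & ~mask & 0xFF


# Hypothetical one-byte "memory"; 0b01000001 is the ASCII code for 'A'.
memory = bytearray([0b01000001])

value = memory[0]                # 1. read the whole byte
value = set_bit(value, 5, True)  # 2. change a single bit
memory[0] = value                # 3. write the whole byte back

print(format(memory[0], "08b"), chr(memory[0]))  # prints: 01100001 a
```

Notice that a single character fits in the one byte, and flipping one bit is enough to change 'A' into 'a'.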
If you go back to the early days of computing, different data sizes were tried; however, the byte became popular because a single character fits nicely in a byte with some space to spare. Thus, a byte became the basic building block used in computers.
Let’s now have a look at what happens when we combine multiple bytes together.
Multiple-Byte Units
This is where things can get a little confusing if you don’t know how to interpret what you are looking at. The International System of Units or SI defines a standard using 1000 as a multiplier. The standard essentially defines Kilo as 1000, Mega as a million, Giga as a billion and so forth. In the case of digital information, these SI prefixes are placed in front of the word byte. This gives us Kilobyte, Megabyte, Gigabyte, Terabyte, Petabyte, Exabyte, Zettabyte and Yottabyte.
The metric prefixes used by SI were around before the first computers; however, there was a problem when the computer was first developed. Since computers work using switches, which can only be a zero or a one, everything in a computer is based on powers of two. Thus, a kilobyte in SI units technically should be 1000 bytes, but 1000 is not a power of two. 1024 is a power of two (two to the power of ten), and being close enough to 1000 was good enough for the first computers. This was not a problem when computers did not have a lot of storage, but as storage increased this started to cause problems.
To make it clearer how much digital storage we are referring to, in 1998 the International Electrotechnical Commission or IEC released a new standard based on 1024 as the multiplier rather than 1000. The standard introduced a new set of prefixes to use when the multiplier is 1024 rather than 1000. This gives us Kibibyte, Mebibyte, Gibibyte, Tebibyte, Pebibyte, Exbibyte, Zebibyte and Yobibyte.
The IEC released this standard so it would be clear if 1000 or 1024 was being used as a multiplier. This is important as storage gets larger, because if you switch between the two different standards the difference becomes larger and larger as the storage gets bigger and bigger. Let’s have a closer look.
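To see how the gap grows, here is a short Python sketch that compares the two multipliers at each step. The function name shortfall_percent is just an illustrative choice.

```python
# SI name / IEC name for each step up in size.
PREFIXES = ["kilo/kibi", "mega/mebi", "giga/gibi", "tera/tebi",
            "peta/pebi", "exa/exbi", "zetta/zebi", "yotta/yobi"]


def shortfall_percent(power: int) -> float:
    """How much smaller 1000**power is than 1024**power, as a percentage."""
    return (1 - 1000**power / 1024**power) * 100


for power, name in enumerate(PREFIXES, start=1):
    print(f"{name:<11} {shortfall_percent(power):5.2f}%")
```

At the kilo level the two systems differ by a little over 2%, but by the tera level the difference is roughly 9%, which is why the mismatch is so noticeable on modern storage.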
M.2 Storage Example
For this example, I will look at a two Terabyte M.2 Drive. To understand a little better how much space I can expect to get, I will have a closer look at the specifications on the manufacturer’s website. The space on the website is given as up to two Terabytes. So, the storage to be expected will be up to two Terabytes and no more, although it could also be less than two Terabytes. The reason it could be lower is if some of the space fails testing. You may not get the full two Terabytes, but it should be pretty close.
To understand how much data two Terabytes is, the website defines one Gigabyte as one billion bytes. I installed this M.2 storage in a computer running Windows 11 and the capacity was reported as 1.8 Terabytes.
Microsoft, since the DOS days, has used a multiplier of 1024. This was acceptable at the time since there was no standard for it. It caused some confusion as time went on because manufacturers of storage would often use 1000 as the multiplier rather than 1024. We can only assume they do this as it allows them to put a bigger storage capacity on the product box. Windows, in most cases, still uses the legacy units and thus the storage space will appear to be under-reported.
To understand this a bit better, I will open the properties for the storage. In properties, the capacity is reported as about one hundred thousand under two million Megabytes. Two million Megabytes would be 2 Terabytes, so this is pretty close.
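The numbers can be checked with a few lines of Python. This sketch assumes the drive holds exactly two trillion bytes, which is the most the "up to two Terabytes" figure allows, so a real drive may come out slightly lower.

```python
# Assumption: the drive holds exactly 2 TB as the manufacturer defines it.
ADVERTISED_BYTES = 2 * 1000**4

# Divide by powers of 1024, the multiplier Windows uses for its figures.
mib = ADVERTISED_BYTES / 1024**2
gib = ADVERTISED_BYTES / 1024**3
tib = ADVERTISED_BYTES / 1024**4

print(f"{mib:,.0f} MiB  {gib:,.1f} GiB  {tib:.2f} TiB")
# prints: 1,907,349 MiB  1,862.6 GiB  1.82 TiB
```

The first figure is a little under one hundred thousand short of two million, which lines up with what the properties window shows.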
When you are using Windows, you need to keep in mind that the storage in some cases may get reported differently depending on which system of units is being used. There is a way to make this a bit clearer; however, Windows has not yet adopted it.
IEC Units
Nowadays, SI units should use 1000 and IEC should use 1024. However, as we have seen in systems like Windows, legacy units are still used. That is, a legacy name uses the same name as the SI units; however, it uses a base of 1024 rather than 1000.
The IEC standard, other than adding a new name for each unit of measurement, also added a lower-case i in the abbreviation. Thus, when you see the extra i, this indicates 1024 is being used as the multiplier rather than 1000.
A lot of modern operating systems have changed over to IEC units. For example, if I look at the same storage device in the same computer running the Linux distribution Ubuntu, you will notice that storage is reported as 1.82 Tebibytes.
This number matches what was reported in Windows. Thus, if you are using Windows, very old software or an old operating system, don’t assume which multiplier is being used unless the value is shown in IEC units. If the value is important to your calculation, double check the value.
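As a rough illustration of how software could label sizes unambiguously, here is a small Python sketch. The function format_size and its binary flag are hypothetical and are not taken from Windows, Ubuntu or any real tool.

```python
SI_UNITS = ["B", "kB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]
IEC_UNITS = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB", "ZiB", "YiB"]


def format_size(num_bytes: int, binary: bool = False) -> str:
    """Format a byte count with SI (1000) or IEC (1024) units."""
    base, units = (1024, IEC_UNITS) if binary else (1000, SI_UNITS)
    value = float(num_bytes)
    index = 0
    # Keep dividing until the value fits under the next multiplier.
    while value >= base and index < len(units) - 1:
        value /= base
        index += 1
    return f"{value:.2f} {units[index]}"


print(format_size(2 * 1000**4))               # prints: 2.00 TB
print(format_size(2 * 1000**4, binary=True))  # prints: 1.82 TiB
```

The same number of bytes gives two different values, but because the second one carries the extra i there is no doubt which multiplier was used.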
In the case of Windows, I would suggest to Microsoft when the old legacy unit is being used, simply change it to the IEC unit. This would clear up a lot of the confusion. Given that Windows 11 still uses legacy units, this does not seem to be a priority for Microsoft.
Before I move on to look at network transfer speeds, I first want to look at some terminology that can be confusing when working with storage and transfer speeds.
b vs B
When looking at the abbreviations used with storage and data transfers, a lower-case b stands for bits and an upper-case B stands for bytes. Generally, bits are used to measure data transfer and bytes are used to measure data storage.
There are eight bits in a byte, so a value given in bits is eight times larger than the same value given in bytes. To convert between the two, you just need to multiply or divide by eight. However, the question is, why use bits and not bytes? It is easy to jump to the conclusion that it is for marketing reasons, since using bits gives you a bigger number and a bigger number looks better than a smaller one, but it is actually for other reasons.
The first reason is that in the very early days, and I mean the very early days, there was no agreement on the size of a byte. Since data transfer may involve systems which define a byte as different sizes, the bit was used because a bit is the smallest amount of data on all systems.
The second reason is that data is transferred in bits including control bits and other information. These bits may include checksums, data headers and other control information. Thus, when transferring data, you get an overhead on top of the actual usable data you get on the other side. Depending on how the data is transferred and its size, this overhead can fluctuate from being very small to being extremely high.
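To get a feel for why the overhead fluctuates so much, here is a minimal Python sketch. It assumes every packet carries a fixed 40 bytes of control information; that figure is made up for illustration and is not taken from any particular protocol.

```python
# Hypothetical fixed per-packet overhead, not from any specific protocol.
HEADER_BYTES = 40


def overhead_percent(payload_bytes: int, header_bytes: int = HEADER_BYTES) -> float:
    """Share of the transmitted data that is not usable payload."""
    total_bytes = payload_bytes + header_bytes
    return header_bytes / total_bytes * 100


for payload in (10, 100, 1460):
    print(f"{payload:>5} byte payload: {overhead_percent(payload):.1f}% overhead")
```

With these assumed numbers, a tiny 10 byte payload is mostly overhead, while a large payload brings the overhead down to a few percent.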
Thus, data transfer has always been measured in bits and is unlikely to change. When you are talking about data storage you are generally only looking at how much data there is and not any overheads used to store it, thus bytes are a good unit to measure it in.
Let’s now have a look at data transfer in more detail.
Data Transfer
When measuring data transfer, SI units are generally used, that is, multiples of one thousand. Data transfer standards are normally defined the same way, using SI prefixes and bits. For example, Ethernet networks use speeds like 100 megabits per second. In most cases when describing data transfers, bits will be used rather than bytes.
It is rare to see IEC units used for data transfer, but it is possible. As with data storage units, the abbreviations are different in that there is an extra lower-case i. Although in most cases bits will be used to measure data transfers, take note that if bytes are used the abbreviation will use an upper-case B rather than a lower-case b.
Now that we have an understanding of the different standards of units that are available, I will next have a look at how to convert between the two.
Converting
There are a lot of tools available to convert between different standards; one convenient option is the conversion tool provided by Google. To access it, search for what you want to convert. In this case, I will search for mb to kb.
Google is pretty good at detecting that you want to convert something, for example, you could use it to convert between kilos and pounds, but if the convert tool does not appear try some different search terms.
In this case, I will change it to convert two Terabytes to Tebibytes. If you recall from earlier in the video, Windows reported the storage as having a capacity of 1863 Gigabytes. Due to rounding errors, the figure will be a little different to the one given by Google, but it should give you an idea of how the tool works.
You can also use the tool to convert between different data transfer units. To do this, select the pull down and select “Data Transfer Rate”. If you have the time, it is worth looking through the menu to see what else you can convert between.
In this case, I will select 100 Megabits per second. To get an idea how fast 100 Megabits will transfer, I will select Megabytes per second. You will notice that 100 Megabits converts to 12.5 Megabytes per second. This will give us an idea of the maximum speed that could be transferred on a network running at 100 Megabits per second. Of course, in the real world, you won’t get a speed as fast as this, but it gives you an idea of what you may get.
If I want to get an idea how fast a one Gigabit network may be, I simply need to change the value to one Gigabit per second. This is good for getting an idea of how much data can be transferred through a data connection but does not account for overheads.
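The same conversion is easy to do yourself, since it is just a division by eight. Here is a minimal Python sketch; the function name is an illustrative choice, and like the Google tool it ignores overheads.

```python
BITS_PER_BYTE = 8


def megabits_to_megabytes(megabits_per_second: float) -> float:
    """Convert a link speed in megabits per second to megabytes per second."""
    return megabits_per_second / BITS_PER_BYTE


print(megabits_to_megabytes(100))   # 100 Mb/s network, prints: 12.5
print(megabits_to_megabytes(1000))  # 1 Gb/s network, prints: 125.0
```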
File Transfer Calculator
There are a number of different online file transfer calculators on the internet, and in this example, I will look at the one provided by Tech Internets. There are others, but I find this one pretty simple to use and it has a simple interface.
At the top, notice that I have the option to define if a kilo is 1024 or 1000 bytes. In this example, I will choose the option 1000. For the size, I will enter in 1. Below this, I can set the size, which for this example, I will select Gigabyte. Thus, I am looking at transferring a gigabyte of data which would be one billion bytes.
For the speed, I will enter in 100. Notice by default it is set to Megabytes per second. Keep in mind that network speed is generally measured in bits, so I will select the option Megabits per second. This particular calculator has an option for overhead. By default this is set to 10%. Overhead can be difficult to select, since a lot of the time it depends on how much data you are sending. If you are sending a lot of small packets of data, the overhead will generally be a lot higher. If you are sending a lot of large packets of data, the overhead will generally be a lot lower.
I will leave it on 10% and press the button “Calculate”. In this example, the calculator has provided an estimate that a one Gigabyte file would take one minute and 28 seconds to transfer over the network with a 10% overhead.
Another good feature of this calculator is that, when I scroll down, there are options for USB speeds. In this example, I will select USB 2.0 speed and press the button “Calculate”. The calculator has estimated it would take 18 seconds to transfer a one Gigabyte file over USB 2.0. Keep in mind this is based on the maximum speed of USB 2.0. In practice, most USB devices don’t operate anywhere close to the maximum speed. Thus, I would treat this as only a very rough estimate.
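The estimates above can be reproduced with a few lines of Python. This is only a sketch: I am assuming the calculator simply adds the overhead percentage on top of the raw transfer time, which happens to match the figures it gave, and I am using 480 megabits per second as the USB 2.0 maximum. The function name transfer_seconds is made up for illustration.

```python
def transfer_seconds(size_bytes: int, speed_bits_per_second: int,
                     overhead: float = 0.10) -> float:
    """Estimate transfer time, adding the overhead on top of the raw time.

    Assumption: overhead is applied as a simple percentage increase in time.
    """
    raw_seconds = size_bytes * 8 / speed_bits_per_second
    return raw_seconds * (1 + overhead)


ONE_GIGABYTE = 1000**3            # a kilo set to 1000, as chosen in the calculator
NETWORK_100_MBPS = 100 * 1000**2  # 100 megabits per second
USB_2_MAX = 480 * 1000**2         # USB 2.0 maximum signalling rate

network_time = round(transfer_seconds(ONE_GIGABYTE, NETWORK_100_MBPS))
usb_time = round(transfer_seconds(ONE_GIGABYTE, USB_2_MAX))

minutes, seconds = divmod(network_time, 60)
print(f"Network: {minutes} min {seconds} s")  # prints: Network: 1 min 28 s
print(f"USB 2.0: {usb_time} s")               # prints: USB 2.0: 18 s
```

Changing the overhead value shows how sensitive the estimate is, which is another reason to treat these results as a guide only.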
In The Real World
In the real world, SI units are generally used for storage specifications and networking standards. The SI units are worked out on base 10, which essentially means each step up is a multiple of 1000. Previously, I have seen storage with both the SI value and the IEC value on the back of the product box. This does not seem to occur anymore. Nowadays, manufacturers seem to only put the SI value on the box, which would make sense as that gives the larger value.
IEC units use base 2, and thus use a multiplier of 1024. When the IEC standard is used, for example MiB, you will see an i in the abbreviation or bi in the name. When you see this, you can be sure the multiplier is 1024, since this is the only standard that uses that terminology.
The problem is that the legacy units and the SI units use the same terminology. Since there were no standards back in the old days of computers, the legacy usage technically was not incorrect. In 1998, when the IEC standard was released, software and operating systems should have moved over to this new system. Nowadays, you will find old software and some modern software still using the legacy system rather than moving over to the IEC system. This can cause some confusion.
Keep in mind that when a standard comes out, it normally takes time for the market to adopt or not adopt it. There are of course a lot of arguments that go on before a new standard is adopted. However, it appears now that the standards are here to stay, and thus it is my belief that software and operating systems should be adopting them. That is just my opinion, however.
End Screen
Thanks for watching this video from ITFreeTraining. I hope this information gives you a better understanding of how to interpret how much data you have. Until the next video from us, I would like to thank you for watching.
Credits
Trainer: Austin Mason https://ITFreeTraining.com
Voice Talent: A Hellenberg https://www.frelancer.com/u/adriaansound
Quality Assurance: Brett Batson https://www.pbb-proofreading.uk