Flash Database

12/17/2023

Storage for DBAs: In a recent news article in the UK, supermarket giant Tesco said it threw away almost 30,000 tonnes of food in the first half of 2013. That's about 33,000 tons for those of you who can't cope with the metric system. The story caused a lot of debate about the way in which we ignore the issue of wasted food, with Tesco being both criticised for the wastage and praised for publishing the figures. But waste isn't a problem confined to the food industry. The chances are it's happening in your data centre too.

Stranded Capacity

As a simple example, let's take a theoretical database which requires just under 6TB of storage capacity. To avoid complicating things, we are going to ignore concepts such as striping, mirroring, caching and RAID for a moment and just pretend you want to stick a load of disks in a server. But here's the thing: the database creates a lot of random I/O, so it has a peak requirement of around 20,000 physical IOPS (I/O operations per second).

How many super-fast 15k RPM disk drives do you need if each one is 600GB? You need about ten, more or less, right? Except those 600GB drives can only service around 200 IOPS each, so you actually need 100 disks to be able to cope with the workload. And 100 multiplied by 600GB is of course 60TB, so you will end up deploying sixty terabytes of capacity in order to service a database of six terabytes in size.

That remaining 54TB of capacity? You can't use it. At least, you can't use it if you want to guarantee the 20,000 IOPS requirement we started out with. Any additional workload you attempt to deploy using the spare capacity will issue I/Os against it, resulting in more IOPS. If you were feeling lucky, you could take a gamble on no new workloads being active during the peak requirement of the original database, but gambling is not something most people like to do in production environments. In other words, your spare capacity is stranded: of your total disk capacity deployed, you can only ever use 10% of it.

Of course, disk arrays in the real world tend to use concepts such as wide striping (spreading chunks of data across as many disks as possible to take advantage of all available performance) and caching (staging frequently accessed blocks in faster DRAM), but the underlying principle remains.

If that previous example makes you cringe at the level of waste, prepare yourself for even worse. In my previous article I talked about the mechanical latency associated with disk, which consists of seek time (the disk head moving across the platter) and rotational latency (the rotation of the platter to the correct sector). If latency is critical (and it always, always is), then one method of reducing the latency experienced on a disk system is to limit the movement of the head, thus reducing the seek time. If we only use the outer sectors of the platter (such as those coloured green in the diagram here), the head is guaranteed always to be closer to the next sector we require; the outer sectors are also preferable because they have a higher transfer rate than the inner sectors (to understand why, see the section on zones in this post). This is known as short stroking.

Of course, this has a direct consequence: a large portion of the disk is now unused, sometimes up to 90%. In the case of a 600GB disk, short stroking may result in only 60GB of usable capacity, which means ten times as many disks are necessary to provide the same capacity as disks which are not short stroked.

Waste Watching

When people talk about disk capacity they tend to be thinking of the storage capacity, i.e. the number of bytes of data that can be stored. However, while every storage device must have a storage capacity, it also has a performance capacity: a limit to the amount of performance it can deliver, measured in I/Os per second and/or some derivative of bytes per second. And the thing about capacities is that bad things tend to happen when you try to exceed them.

In simplistic terms, performance and storage capacity are linked, with the ratio between them being specific to each type of storage. With disk drives, the performance capacity usually becomes the blocker before the storage capacity, particularly if the I/O is random (which means high numbers of IOPS). This means any overall solution you design must exceed the required storage capacity in order to deliver on performance. In the case of flash memory, the opposite is usually true: by supplying the required storage capacity, you will usually have a surplus of performance capacity. Provide enough space and you shouldn't need to worry about things like IOPS and bandwidth. (Although I'm not suggesting you should forego due diligence and just hope everything works out ok…)
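The sizing arithmetic in the disk example can be sketched as a quick calculation. The figures for the disk case (600GB drives sustaining ~200 random IOPS each, a ~6TB database peaking at 20,000 IOPS, and the ~10% usable fraction under short stroking) come from the text; the function name and the flash per-device numbers are illustrative assumptions, not vendor specifications:

```python
import math

def devices_needed(required_gb, required_iops, dev_gb, dev_iops):
    """Devices needed to satisfy BOTH the storage capacity and the
    performance capacity requirements, whichever is the blocker."""
    for_capacity = math.ceil(required_gb / dev_gb)
    for_performance = math.ceil(required_iops / dev_iops)
    return max(for_capacity, for_performance)

# The article's example: a ~6TB database peaking at 20,000 random IOPS,
# built from 600GB 15k RPM drives sustaining ~200 random IOPS each.
disks = devices_needed(6_000, 20_000, 600, 200)
deployed_gb = disks * 600
stranded_gb = deployed_gb - 6_000
print(disks, deployed_gb, stranded_gb)   # 100 60000 54000  (54TB stranded)

# Short stroking: only ~10% of each platter is used, so a 600GB drive
# yields ~60GB usable; the two limits now coincide at 100 drives.
print(devices_needed(6_000, 20_000, 60, 200))   # 100

# Flash (illustrative numbers): with ample IOPS per device, the storage
# capacity becomes the sizing factor instead of the performance capacity.
print(devices_needed(6_000, 20_000, 1_000, 50_000))   # 6
```

The point the numbers make: for disk, the IOPS term dominates (100 drives for a workload that fits on 10), while for flash the capacity term dominates, which is exactly the "surplus of performance capacity" described above.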