In today’s world, all visual and audio data is now in digital format. Each new day in surveillance requires new terabytes of storage space for the thousands of cameras around various cities, major towns and regional areas, including shopping centres, factories and prisons, to name a few. The need for data storage grows exponentially with every new megapixel sensor and with every new technology (which was high definition [HD] video until only a year or two ago): today it is 4k, tomorrow 8k and beyond. The surveillance industry is one of the largest consumers of data storage media because of the need for constant recording of information from cameras (with the associated time/date, GPS, various interfaces and so on). Hard disk manufacturers paid little attention to the CCTV and surveillance industry until it switched to digital storage. They then realised the size of this new data storage market.

Short-term Storage

Short-term storage is the recycling of the available storage space using the first-in, first-out (FIFO) principle. In other words, if recorders have enough drive capacity to store only 14 days, an incident that occurred three weeks previously cannot be retrieved, because it will have been overwritten by the newer recordings made during the last 14 days. Some users call this short-term storage retention. It is very important for a surveillance manager to know what the storage retention is. If an incident occurs that operators did not notice, or the system did not pick up, and it is not discovered within the retention period, the footage will be lost. The retention can be extended by simply adding more drives in the planning stage (some users may ask for six months or even 12 months of storage), but it comes at a cost. Additionally, it takes physical space, consumes more power and takes more time for an operator to find an incident. Privacy laws and industrial laws in some countries may also limit the maximum storage retention a surveillance system can have.
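The FIFO overwrite behaviour described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that footage is stored in whole-day units and that the recorder keeps exactly 14 of them; the function names are hypothetical and do not come from any recorder software.

```python
from collections import deque

RETENTION_DAYS = 14  # retention from the example in the text


def days_on_disk(today, retention=RETENTION_DAYS):
    """Return the day numbers still stored after recording up to `today`."""
    store = deque(maxlen=retention)  # a full deque drops its oldest entry
    for day in range(1, today + 1):
        store.append(day)  # one hypothetical recording unit per day
    return list(store)


def is_retrievable(incident_day, today, retention=RETENTION_DAYS):
    """True if footage from `incident_day` has not yet been overwritten."""
    return incident_day in days_on_disk(today, retention)


today = 30
print(is_retrievable(today - 21, today))  # incident three weeks ago: False
print(is_retrievable(today - 7, today))   # incident one week ago: True
```

A real recorder overwrites in much smaller units than a day, but the principle is the same: the oldest data is always the first to go.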

Long-term Storage

Long-term storage refers to indefinite storage of information. This typically happens by backing up the detected incident from the short-term storage to another media form and storing the information for a longer term by not allowing it to be overwritten or erased. Some users refer to this as archived storage.

The media for long-term storage can be the same as for short-term storage, or different. The current dominant storage technology is still the magnetic hard disk drive, although solid state electronic storage in the form of flash drives, SD cards and solid state drives (SSD) is becoming more popular due to its increasing affordability and capacity. Optical media, in the form of CD-ROMs, DVDs, or even Blu-ray discs, are slowly becoming obsolete.

Long-term storage, in theory, is not really indefinite, as eventually the media will lose its properties and can no longer be read. A hard drive may lose its magnetic particle polarisation after many years, and the same can be said about optical or solid state media, although the number of years that it would take for this to occur has not yet been ascertained.

The inability to read archived video footage after many years is more likely to be caused by the hardware designed to read the data becoming obsolete than by the ageing of the media itself; remember the floppy drives or zip drives which some used for backing up data around 15 years ago. The life expectancy of long-term data storage goes hand in hand with the life expectancy of the technology that is used.

Storage Capacities Today

Current video surveillance technology offers much more visual detail than the old analogue video. HD video, with its 1920 x 1080 pixels, offers five times the number of pixels an analogue image offers when converted to digital pixels. The latest 4k, with its 3840 x 2160 pixel count, quadruples the HD pixel real estate, and it is 20 times the analogue pixel count.
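The pixel ratios quoted above can be checked with a short sketch. The 720 x 576 frame size used here for digitised PAL (D1) is an assumption, since the text does not state it; with the slightly narrower 704 x 576 (4CIF) frame the ratios come out marginally higher.

```python
# Frame sizes in pixels. The 720 x 576 figure for digitised PAL (D1)
# is an assumption; the HD and 4k sizes are those given in the text.
FRAME_SIZES = {
    "PAL D1": (720, 576),
    "HD": (1920, 1080),
    "4k": (3840, 2160),
}


def pixel_count(name):
    """Total number of pixels in one frame of the named format."""
    width, height = FRAME_SIZES[name]
    return width * height


def pixel_ratio(a, b):
    """How many times more pixels format `a` has than format `b`."""
    return pixel_count(a) / pixel_count(b)


print(pixel_ratio("HD", "PAL D1"))  # 5.0
print(pixel_ratio("4k", "HD"))      # 4.0
print(pixel_ratio("4k", "PAL D1"))  # 20.0
```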

When the original full frame PAL was converted to digital, it was called 4CIF (or D1) resolution. If it was not compressed, it would occupy around 170Mb/s bandwidth. This was a vast amount of data to be stored on the old PATA (Parallel Advanced Technology Attachment) drives, especially if there were multiple channels.

So, there was no choice but to start using video compression, which at the time (about a decade ago) was the broadcast-proven MPEG-2. This was the same video compression used on DVDs, and visually it appeared no different to the uncompressed stream, although the 170Mb/s raw stream was squeezed down to 4Mb/s of MPEG-2.
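As a rough cross-check of these figures, the raw bit rate can be estimated from the frame size. The sketch below assumes a 720 x 576 frame at 25 frames per second with 16 bits per pixel (8-bit 4:2:2 sampling); none of these parameters are stated in the text, and the result covers the active picture only, so it lands slightly below the quoted 170Mb/s.

```python
def raw_bitrate_mbps(width, height, fps, bits_per_pixel):
    """Uncompressed video bit rate in megabits per second."""
    return width * height * fps * bits_per_pixel / 1_000_000


# Assumed parameters for digitised PAL: 720 x 576, 25 fps, 16 bits/pixel.
raw = raw_bitrate_mbps(720, 576, 25, 16)
compressed = 4  # MPEG-2 stream in Mb/s, as quoted in the text

print(round(raw, 1))            # 165.9
print(round(raw / compressed))  # 41, i.e. roughly a 40:1 compression ratio
```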

The introduction of the HD standard in surveillance over 10 years ago came after broadcast television became very comfortable with it. Uncompressed 720p HD and 1080i HD produced streams of nearly 1.5Gb/s. 1080p produced a 3Gb/s stream for just one camera. Although these streams are not so difficult to handle in broadcast studios, as soon as an HD stream needed to be stored, or transmitted over longer cable runs or networks, there was no choice but to compress the data.

MPEG-2 was designed to cater for HD video, resulting in a stream of over 20Mb/s, which was still pretty high, so new video compression methods were sought. This resulted in H.264 video compression, which is the most common codec used today, reducing the HD video stream down to 4-6Mb/s; basically, the same stream size that standard definition (SD) video required using MPEG-2. Having a similar bandwidth to SD, H.264 made the use of HD very convenient, not only for storage length but also for the network.

The latest trend of 4k video sensors and cameras, also referred to as Ultra High Definition Television 1 (UHDTV1), is another huge leap in pixels and results in raw streams of over 12Gb/s. Clearly, even more efficient video compression was needed. Although H.264 can compress 4k, the efficiency needed to be higher, which led to the development of H.265.

H.265 offers approximately twice the efficiency of H.264, reducing HD streams to 2Mb/s and producing a 4k compressed stream of around 6Mb/s. And while H.264 video compression is computationally more intensive than MPEG-2, H.265 is more demanding yet again. Surveillance cameras usually compress data using their built-in hardware encoders, but the viewing workstations require all the decoding to be done in the viewing client software. The more camera streams that need to be displayed, the more decoding power is required. This puts large demands on the operating system (whether it is 32-bit or 64-bit), as well as on the processing power of the main central processing unit (CPU) aided by the graphical processing unit (GPU), resulting in the need for relatively powerful systems to run this type of CCTV system.

Data Storage Requirement with 4Mb/s Streams

An average compressed stream of 4Mb/s, irrespective of whether it is SD with MPEG-2, HD with H.264, or even 4k with H.265, would require storage capacities for hourly, daily, weekly, monthly, half-yearly and yearly recordings as shown in Table 1.
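For readers who want to derive such figures themselves, the following is a minimal sketch of the arithmetic for a single continuously recorded stream. It assumes decimal units (1GB = 1,000,000,000 bytes), a 30-day month and a 365-day year; these conventions are assumptions and the results are not taken from Table 1.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Recording periods expressed in days (30-day month and 365-day year assumed).
PERIODS_IN_DAYS = {
    "hour": 1 / 24,
    "day": 1,
    "week": 7,
    "month": 30,
    "half-year": 182.5,
    "year": 365,
}


def storage_gb(stream_mbps, days, cameras=1):
    """Storage in decimal gigabytes for continuous recording."""
    bytes_per_second = stream_mbps * 1_000_000 / 8
    return bytes_per_second * SECONDS_PER_DAY * days * cameras / 1e9


for label, days in PERIODS_IN_DAYS.items():
    print(f"{label:>10}: {storage_gb(4, days):10.1f} GB per camera")
```

Under these assumptions a 4Mb/s stream needs about 43GB per camera per day, or nearly 16TB per camera per year.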

So, what storage capacities are available today? Firstly, the largest readily available magnetic hard drive today is 14TB to 16TB, using the Serial ATA (SATA) interface in the 3.5” physical form factor, available through Seagate and others. Secondly, the largest magnetic 2.5” drives currently are 5TB, by Seagate (5TB BarraCuda ST5000). Thirdly, the largest readily available SSD is apparently 60TB, in a 3.5” form factor, also available from Seagate. Fourthly, the largest SD and micro-SD memory cards, which some Internet Protocol (IP) cameras are using for edge storage, can now reach 2TB, the maximum capacity defined for SDXC and microSDXC cards.

Practical Examples

So, looking at Table 1, a 32-camera surveillance system with 4Mb/s H.264 streams, using just one 8TB drive, could record for just under six days in continuous mode (no motion detection recording).

Data storage requirement

Assume now that coverage is required for a factory with 32 cameras, and motion detection triggered recording is being used. If the factory operates for eight hours a day, seven days a week, it can be estimated that the cameras will see movement for about one-third of the time, and therefore the surveillance system will be recording for 33 percent of the time. This is used in Table 1 as VMD with 33 percent activity. Clearly this is an approximation, as some cameras will have no movement during the eight-hour day, whereas some may have more than that (such as those covering the reception area, visitors, or cleaners attending after hours).
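The effect of motion-triggered recording on retention can be sketched as follows. The example assumes decimal terabytes, a single 8TB drive, 32 cameras at 4Mb/s and an `activity` factor of one-third for the VMD case; the function name and parameters are hypothetical.

```python
def retention_days(capacity_tb, cameras, stream_mbps, activity=1.0):
    """Days of footage that fit on the drive before overwriting starts.

    `activity` is the fraction of time the cameras are actually recording
    (1.0 for continuous mode, about 0.33 for the VMD estimate).
    """
    bytes_per_day = cameras * stream_mbps * 1_000_000 / 8 * 86_400 * activity
    return capacity_tb * 1e12 / bytes_per_day


continuous = retention_days(8, 32, 4)
vmd = retention_days(8, 32, 4, activity=1 / 3)

print(f"continuous: {continuous:.1f} days")  # continuous: 5.8 days
print(f"VMD at 33%: {vmd:.1f} days")         # VMD at 33%: 17.4 days
```

Recording for one-third of the time simply triples the retention of the same drive, provided the activity estimate holds.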

In another scenario with the same factory example, suppose the owners now want a whole year of recording in continuous mode; over 480TB of storage would be required. This equates to over 60 drives (each of 8TB capacity). Hardly any computer or recorder will host that many drives in one chassis today; therefore, it is necessary to split this storage, keeping in mind the amount of data traffic that the network switches are capable of transferring between the cameras and the storage. The amount of data that can be written to the drives needs to be considered, and allowances also need to be made for the playback and archiving data to be transferred out of the same storage.
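The yearly figure can be cross-checked with a short calculation. This is a sketch only: it assumes decimal terabytes, a 365-day year and no RAID or file system overhead, so the exact totals depend on those conventions.

```python
import math


def yearly_storage_tb(cameras, stream_mbps, days=365):
    """Continuous-recording storage in decimal terabytes."""
    total_bytes = cameras * stream_mbps * 1_000_000 / 8 * 86_400 * days
    return total_bytes / 1e12


def drives_needed(total_tb, drive_tb):
    """Whole drives required, before any redundancy is added."""
    return math.ceil(total_tb / drive_tb)


total = yearly_storage_tb(32, 4)
print(round(total, 1))          # 504.6
print(drives_needed(total, 8))  # 64
```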

So, in this example, even with only 32 cameras using 4Mb/s streams, the recorder must ingest at least 128Mb/s (4 x 32). At least three times this figure should be allowed for, to cover a worst-case scenario of operators viewing the same 32 streams (assuming they have enough CPU and GPU power to view 32 channels), plus an allowance for backup. This brings the data throughput of one recorder, between the hard disks and the network switch, close to 400Mb/s.

Do not forget that SATA revision 3 quotes a 6Gb/s link rate, which corresponds to a theoretical maximum data throughput of 4.8Gb/s. In reality, sustained transfer rates are much lower than that, as the spinning magnetic disk has mechanical limits, which depend on the disk, its mass and the power consumed. If network overheads of at least 50 percent are added, a good 1Gb/s network is needed for the 32 cameras in the example. Often, there are more than 32 cameras in a system, on one network. This increases the network switching demand to much higher than 1Gb/s.
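The throughput budget for this example can be laid out as a small sketch. The three-fold load factor and the 50 percent network overhead are the allowances given in the text; the function and parameter names are hypothetical.

```python
def throughput_budget(cameras, stream_mbps, load_factor=3, overhead=0.5):
    """Return (ingest, recorder, network) throughput in Mb/s.

    ingest   - streams arriving from the cameras
    recorder - ingest plus worst-case viewing and backup traffic
    network  - recorder traffic with network overhead added
    """
    ingest = cameras * stream_mbps
    recorder = ingest * load_factor
    network = recorder * (1 + overhead)
    return ingest, recorder, network


ingest, recorder, network = throughput_budget(32, 4)
print(ingest, recorder, network)  # 128 384 576.0
print(network <= 1000)            # True: fits within a 1Gb/s link
```

The same function shows how quickly the demand grows: with more cameras on one network, the 1Gb/s limit is soon exceeded.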

This is where data planning is crucial for larger projects, and a full discussion of it is beyond this article. Suffice to say that a non-IT person should be aware of the many bottlenecks in a digital IP system, including the camera sensor read-out speed, the video compression, the network interface speed and efficiency, writing to the hard drive, retrieval for playback and the decoding ability of the viewing stations.

The next big thing to consider is the inevitable hard disk failures, especially when such long-term storage is required. This is where RAID-1, RAID-5 and RAID-6 redundancy configurations are important. These add to the number of drives calculated above. Various RAID configurations have been documented and discussed elsewhere in various books, on the Internet and in manufacturers’ manuals, so those interested can seek further reading.
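To give a feel for how redundancy adds to the drive count, here is a minimal sketch. It assumes that RAID-1 mirrors every data drive, that RAID-5 and RAID-6 spend one and two drives per array on parity respectively, and that arrays are built from a hypothetical group size of eight drives; real systems may use different group sizes and hot spares.

```python
import math

# Parity drives per array for the striped levels (RAID-1 is handled apart).
PARITY_DRIVES = {"RAID-5": 1, "RAID-6": 2}


def drives_with_raid(data_drives, level, group_size=8):
    """Total drives needed to provide `data_drives` of usable capacity."""
    if level == "RAID-1":
        return data_drives * 2  # every drive is mirrored
    parity = PARITY_DRIVES[level]
    data_per_group = group_size - parity
    groups = math.ceil(data_drives / data_per_group)
    return data_drives + groups * parity


# 60 data drives is an illustrative figure only.
for level in ("RAID-1", "RAID-5", "RAID-6"):
    print(level, drives_with_raid(60, level))
# RAID-1 120
# RAID-5 69
# RAID-6 80
```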

For quick and easy calculation of the required storage capacity for a given number of cameras and required length of recording, you can use the Seagate storage calculator; you can also use my latest application, the ViDiLabs calculator, available on both iOS and Android smart devices.

Vlado Damjanovski is an author, inventor, lecturer and closed circuit television (CCTV) expert who is well known within the Australian and international CCTV industry. Vlado has a degree in Electronics Engineering from the University “Kiril & Metodij” in Skopje (Macedonia), specialising in broadcast television and CCTV. In 1995, Vlado published his first technical reference book, simply called ‘CCTV’, one of the first complete reference manuals on the subject of CCTV. Now in its 4th edition, and translated into four languages, Vlado’s book is recognised the world over as one of the leading texts on CCTV. Vlado is currently actively participating in and contributing towards the IEC 62676 international IP VSS Standards on behalf of the Australian industry.