This Is How The Data Backup Actually Works
Whether in a private or professional environment, data backups are often regarded as a necessary evil and neglected accordingly. In times of increasing ransomware attacks, however, they are often essential for survival in the truest sense of the word, for example when encrypted IT systems have to be made ready for use again in the shortest possible time so that business operations can continue.
Backup definition
A backup enables users to restore deleted, overwritten, or manipulated data. This prevents the loss of important information. Often this is done via offline copies or an external hard drive, but the cloud is also a backup option.
Such security measures are moving more and more into focus in the age of digitization because nowadays, vast amounts of data are produced, processed, and read.
For example, private individuals save their digital documents, such as bank statements or invoices, online and maintain extensive archives with smartphone photos. In the professional environment, for example in connection with Industry 4.0, enormous amounts of business-critical data are generated, which are often relocated to the cloud for space and administrative reasons.
However, there is always the risk of data loss through accidental deletion, a hardware failure, or a cyber attack. The counterpart to the backup, the recovery process that restores the data, should therefore not be neglected either.
Against this background, 2,050 executives from 19 countries were surveyed as part of the current “Veritas Vulnerability Lag Report.” The result: on average, companies will need two million euros and 24 new IT employees in the next twelve months to close the security gaps in their IT systems and to adequately protect their data.
Eighty-two percent of those surveyed also stated that their company had suffered from downtimes in the past few months. So what do companies have to look out for in a modern backup system?
Create backup – four options
There are four important ways to create a backup:
Full backup
With this type of backup, companies can back up all data on a server, for example. It is the foundation for disaster recovery. A full backup is recommended when backup sources have been changed or updated. This also applies to significant adjustments to the configurations of the operating system and applications.
This variant’s advantage is that a backup of the entire system is always available in a single backup set. All information is in one place, or one “data set”: there is no need to search, and recovery times are comparatively short, which helps meet a tight recovery time objective (RTO). However, it is a relatively inefficient solution because the same unchanged data is copied again with every run.
Incremental backup
In this case, unlike with a full backup, not all data is backed up. Instead, each run captures only the data that is new or has changed since the previous backup.
This uses storage space more efficiently than frequent full backups. However, the backed-up data is distributed over several backup sets. A restore can therefore take longer, especially since the individual backup sets have to be restored in the correct order.
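The following minimal Python sketch illustrates the idea under simplifying assumptions. A backup state is modeled as a plain dictionary from file path to a content version. The names `incremental_set` and `restore` are hypothetical and do not belong to any backup product, and file deletions are deliberately ignored.

```python
from typing import Dict, List

# Assumption for illustration: a snapshot maps a file path to a content version.
Snapshot = Dict[str, str]


def incremental_set(current: Snapshot, previous: Snapshot) -> Snapshot:
    """Return only the files that are new or changed since the previous backup."""
    return {path: version for path, version in current.items()
            if previous.get(path) != version}


def restore(full: Snapshot, increments: List[Snapshot]) -> Snapshot:
    """Rebuild the latest state by applying the increments oldest first."""
    state = dict(full)
    for increment in increments:
        state.update(increment)  # later increments overwrite earlier versions
    return state


monday = {"a.txt": "v1", "b.txt": "v1"}                     # full backup
tuesday = {"a.txt": "v2", "b.txt": "v1"}                    # a.txt changed
wednesday = {"a.txt": "v3", "b.txt": "v1", "c.txt": "v1"}   # a.txt changed again, c.txt added

inc_tue = incremental_set(tuesday, monday)       # contains only a.txt
inc_wed = incremental_set(wednesday, tuesday)    # contains a.txt and c.txt

restored = restore(monday, [inc_tue, inc_wed])
print(restored == wednesday)                                # True
print(restore(monday, [inc_wed, inc_tue]) == wednesday)     # False: wrong order
```

The last line shows why the order matters: applying Tuesday's increment after Wednesday's brings back an outdated version of `a.txt`.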
Differential backup
Such backups include all data that has been added or changed since the last full backup. This approach is based on the idea that, in general, the rates of change of the data are relatively low compared to the total amount of data. This reduces the time required to perform the backup.
Another advantage compared to the incremental backup method is that a maximum of two backup media are required at the time of data recovery: the last full backup and the most recent differential backup. And the fewer media in use, the lower the risk that a restore job will fail due to a media failure.
At the same time, this approach enables faster recovery than restoring from incremental backups. However, every differential backup contains all files that have been added or modified since the last full backup. This leads to larger backup copies, which place a correspondingly higher load on storage.
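As a rough illustration, the sketch below models a differential backup in Python. It assumes that a backup state can be represented as a dictionary from file path to content version. The function names `differential_set` and `restore_from_differential` are hypothetical, and deletions are not handled.

```python
from typing import Dict

# Assumption for illustration: a snapshot maps a file path to a content version.
Snapshot = Dict[str, str]


def differential_set(current: Snapshot, last_full: Snapshot) -> Snapshot:
    """Return every file that is new or changed since the last FULL backup."""
    return {path: version for path, version in current.items()
            if last_full.get(path) != version}


def restore_from_differential(full: Snapshot, latest_diff: Snapshot) -> Snapshot:
    """A restore needs only two sets: the full backup and the latest differential."""
    state = dict(full)
    state.update(latest_diff)
    return state


full = {"a.txt": "v1", "b.txt": "v1"}                       # full backup on Monday
tuesday = {"a.txt": "v2", "b.txt": "v1"}                    # a.txt changed
wednesday = {"a.txt": "v2", "b.txt": "v1", "c.txt": "v1"}   # c.txt added

diff_tue = differential_set(tuesday, full)      # contains a.txt
diff_wed = differential_set(wednesday, full)    # contains a.txt again, plus c.txt

print(len(diff_tue), len(diff_wed))                              # 1 2
print(restore_from_differential(full, diff_wed) == wednesday)    # True
```

Wednesday's differential repeats the change to `a.txt` that Tuesday's already contained. This is why differentials grow until the next full backup, while the restore still needs only two sets.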
Synthetic backup
Synthetic backup, also known as catalog-controlled backup, is used when a regular full backup is not feasible because it would take too long, the backup window is too small, or it would generate too much network traffic.
Synthetic backups are generally based on incremental backups or deduplication technologies. Deduplication reduces the storage required for data backups by deciding, ideally at segment level, whether the relevant data is unique or already known.
Accordingly, the deduplication mechanism uses the already known segments and connects them to the backup at a logical level. This technique maintains a table of catalog information and updates it as data is added, modified, or deleted. A logical full backup is always available for the restore via the backup program logic.
In particular, if deduplication is carried out on a server to be backed up (CSD or client-side deduplication), this significantly reduces the load on the network infrastructure, and shorter backup windows can be achieved.
In the event of a restore, the backup program assembles a full backup from the increments or segments and thus enables a prompt full restore.
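The mechanism described above can be sketched in a few lines of Python. This is a simplified model, not how any particular product works: the fixed segment size, the use of SHA-256 as the segment fingerprint, and the class name `DedupStore` are all assumptions made for illustration.

```python
import hashlib
from typing import Dict, List

SEGMENT_SIZE = 4  # bytes; unrealistically small so the example stays readable


class DedupStore:
    """Hypothetical segment store with a catalog of logical full backups."""

    def __init__(self) -> None:
        self.segments: Dict[str, bytes] = {}      # fingerprint -> segment data
        self.catalog: Dict[str, List[str]] = {}   # backup name -> ordered fingerprints

    def backup(self, name: str, data: bytes) -> int:
        """Store a backup and return the number of segments actually written."""
        written = 0
        fingerprints = []
        for offset in range(0, len(data), SEGMENT_SIZE):
            segment = data[offset:offset + SEGMENT_SIZE]
            fingerprint = hashlib.sha256(segment).hexdigest()
            if fingerprint not in self.segments:   # unique segment: store it
                self.segments[fingerprint] = segment
                written += 1
            fingerprints.append(fingerprint)       # known segment: only reference it
        self.catalog[name] = fingerprints
        return written

    def restore(self, name: str) -> bytes:
        """Assemble a logical full backup from the catalog entries."""
        return b"".join(self.segments[fp] for fp in self.catalog[name])


store = DedupStore()
first = store.backup("monday", b"AAAABBBBCCCC")    # three unique segments
second = store.backup("tuesday", b"AAAABBBBDDDD")  # only DDDD is new

print(first, second)             # 3 1
print(store.restore("tuesday"))  # b'AAAABBBBDDDD'
```

Although the second run writes only one segment, the catalog still allows a complete restore of either day, which corresponds to the logical full backup described above.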
Data backup – identify dark data
Data management is closely related to the creation of backups. After all, companies should only create necessary backups. “Garbage data” – that is, trivial and superfluous data – should be sorted out before creating backups.
In this context, companies must examine their storage for so-called dark data. This is information whose value to the company is unknown. According to studies by Veritas, its share is now over 50 percent.
Securing superfluous information ties up resources without drawing any added value from them. The following steps will help identify and avoid dark data:
- Identify and view all data sources: Data mapping and data discovery are the first two measures that help companies better understand the data flow in their organization and gain an overview of their data.
- Prevent dark data in the long term: Companies should only store data that is earmarked for a specific purpose. With meta tags and flexible rules for storing the data, irrelevant information can be more easily tracked down and deleted without risk. In addition, self-services allow authorized and privileged users to intervene earlier in the dark data creation process.
- Carry out automation: An independent process is recommended for analysis, tracking, and reporting. As a result, companies can better identify dark data based on automated reports.
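An automated report of this kind could look roughly like the following Python sketch. The criteria are assumptions chosen for illustration only: a file counts as a dark data candidate if it carries no meta tags and has not been accessed for a configurable number of days. The names `FileRecord` and `find_dark_data` are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import List, Set


@dataclass
class FileRecord:
    """Hypothetical inventory entry produced by a data discovery step."""
    path: str
    last_accessed: date
    tags: Set[str] = field(default_factory=set)  # meta tags describing the purpose


def find_dark_data(records: List[FileRecord], today: date,
                   max_idle_days: int = 365) -> List[str]:
    """List files without any tag that have been idle longer than max_idle_days."""
    cutoff = today - timedelta(days=max_idle_days)
    return [r.path for r in records
            if not r.tags and r.last_accessed < cutoff]


inventory = [
    FileRecord("/share/old_export.csv", date(2021, 5, 1)),
    FileRecord("/finance/invoice.pdf", date(2021, 5, 1), {"finance"}),
    FileRecord("/tmp/new.log", date(2023, 12, 1)),
]

report = find_dark_data(inventory, today=date(2024, 1, 1))
print(report)  # ['/share/old_export.csv']
```

Files flagged this way are only candidates. Whether they can be excluded from backups or deleted remains a business decision.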
Cloud backup: Back up data in the cloud
Cloud backup is already part of everyday life in many companies because it offers a lot of space and a high level of user-friendliness. Accordingly, the associated backup systems are often easy to use.
IT managers can thus back up rapidly growing amounts of data and workloads and, if necessary, restore data, even in complex environments made up of local data centers and private and public clouds.
With cloud backup, users benefit from four factors:
- Free choice of provider: The range of cloud service providers (CSPs) is wide. Users can therefore distribute their backups to different cloud providers, depending on which one provides the desired functions. For example, Google offers extensive analytics, Microsoft focuses on legacy applications, and Amazon Web Services attaches great importance to cloud storage.
- Independence: A company that works with only one cloud provider becomes dependent. Vendor lock-in can make the transaction costs of a change from one CSP to another uneconomical above a certain amount of data. A multi-cloud approach is therefore recommended in this context. It allows the data to be distributed to different providers and makes it easier to move data.
- Security: If the backups are stored at several CSPs in different regions, security increases. No service is entirely fail-safe: natural disasters, hacker attacks, or human error can occur unexpectedly. With distributed storage, however, a functioning backup is always available.
- Scalability: A considerable advantage of the cloud is the billing according to storage space and computing cycles for the used workloads. Corporations with their own data center can thus decide flexibly where to store their data. Long-term backups should be stored in inexpensive cloud storage, while critical data should be stored locally.
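The placement rules from the list above can be expressed as a small policy function. The Python sketch below is only an illustration: the target names such as `csp-a-archive`, the 90-day threshold, and the names `BackupSet` and `choose_targets` are invented placeholders and do not refer to real services or products.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BackupSet:
    """Hypothetical description of one backup set."""
    name: str
    critical: bool   # business-critical data
    age_days: int    # age of the backup set in days


def choose_targets(backup: BackupSet, long_term_after_days: int = 90) -> List[str]:
    """Pick storage targets; all target names are placeholders."""
    if backup.critical:
        # Critical data stays in the company's own data center.
        return ["local-datacenter"]
    if backup.age_days >= long_term_after_days:
        # Long-term backups go to inexpensive storage at two providers.
        return ["csp-a-archive", "csp-b-archive"]
    # Everything else is spread across two providers in standard storage.
    return ["csp-a-standard", "csp-b-standard"]


print(choose_targets(BackupSet("erp-db", critical=True, age_days=1)))
print(choose_targets(BackupSet("mail-2019", critical=False, age_days=400)))
print(choose_targets(BackupSet("fileshare", critical=False, age_days=7)))
```

Writing non-critical backups to two providers reflects the independence and security points, while the age threshold reflects the cost argument for long-term backups.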
Worst-case backup scenario: disaster recovery
If an emergency has occurred and companies need to restore data, all the necessary steps should be determined in advance. In this way, smooth recovery processes can be implemented. The following five steps must be observed:
- Define the basics: Every application in a company corresponds to a specific business value. The more critical the application, the faster it has to be restored, and a time window as short as around 5 minutes is not uncommon. In addition to these so-called recovery time objectives (RTOs), the maximum amount of data an application can lose during the failure must also be regulated. This is reflected in the recovery point objectives (RPOs). They determine how frequently backups have to be taken.
- Independently running processes: Automatic processes are an enormous help for companies in quickly restoring data and applications. They also minimize the risk of mistakes. Therefore, a recovery with automated processes for failover, failback, and testing is recommended to avoid lengthy and expensive downtimes.
- Carry out test runs: Testing is essential to ensure functionality in the event of a crisis. In some industries, such tests are already included in the compliance requirements, for example, in the financial and health sectors. The tests must run in a sandbox without affecting the production systems. Thanks to these simulations, employees can better assess the duration and complexity of the process, an important finding that can save teams a lot of stress in the event of a system failure.
- Central recording of different clouds: Companies often use different cloud solutions simultaneously. However, it does not make sense to use a separate tool for every environment. This approach entails higher operating costs and longer downtimes, and it increases complexity and thus the risk for the business processes. A comprehensive disaster recovery strategy, on the other hand, encompasses all areas and is more efficient.
- Addressing individual requirements: There are different scenarios for a recovery. For example, only a few virtual machines may be involved. But complex, multi-layer applications or even a whole data center are also conceivable. The recovery strategy should be able to map these options flexibly.
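The relationship between RTO, RPO, and backup intervals from the first step can be checked mechanically. The following Python sketch assumes that each application has a documented RTO, RPO, and backup interval. The application names and values are invented for illustration, and `recovery_order` and `rpo_violations` are hypothetical helper names.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import List


@dataclass
class Application:
    """Hypothetical recovery requirements for one application."""
    name: str
    rto: timedelta              # maximum tolerable time until the application runs again
    rpo: timedelta              # maximum tolerable window of lost data
    backup_interval: timedelta  # how often the application is currently backed up


def recovery_order(apps: List[Application]) -> List[str]:
    """Restore the applications with the shortest RTO first."""
    return [app.name for app in sorted(apps, key=lambda app: app.rto)]


def rpo_violations(apps: List[Application]) -> List[str]:
    """Applications whose backup interval is longer than their RPO allows."""
    return [app.name for app in apps if app.backup_interval > app.rpo]


apps = [
    Application("intranet", timedelta(hours=8), timedelta(hours=24), timedelta(hours=12)),
    Application("payments", timedelta(minutes=5), timedelta(minutes=1), timedelta(minutes=1)),
    Application("crm", timedelta(hours=1), timedelta(minutes=15), timedelta(hours=1)),
]

print(recovery_order(apps))   # ['payments', 'crm', 'intranet']
print(rpo_violations(apps))   # ['crm']
```

In this example the CRM system is backed up hourly although it may lose at most 15 minutes of data, so its backup interval would have to be shortened.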