SAP HANA Savepoints

1. What are SAP HANA savepoints?

Savepoints synchronize changes made in memory with the persistence on disk. During a savepoint, all modified pages of the row store and column store are written to disk.

Each SAP HANA host and service has its own savepoints.
The data belonging to a savepoint represents a consistent state of the data on disk and remains untouched until the next savepoint operation has been completed.
A recent savepoint shortens the restart time of SAP HANA, because fewer redo log entries have to be applied to make the database consistent.

2. When is a savepoint triggered?

Savepoints are triggered in the following ways:

Scenario Details
Savepoint interval (automatic): During normal operations, savepoints are triggered automatically once a predefined time has passed since the last savepoint. The interval between two consecutive savepoints is controlled by the following parameter:

global.ini -> [persistence] -> savepoint_interval_s

Its default value is 300, so savepoints are taken every 300 seconds (5 minutes).

System command (manual): The following SQL command executes a savepoint manually:

ALTER SYSTEM SAVEPOINT

Soft shutdown: A soft shutdown invokes a savepoint before the services are stopped. A hard shutdown doesn’t trigger a savepoint, which can increase the subsequent restart time.
Startup: After a consistent database state is reached during startup, a savepoint is performed.
Snapshots: Snapshots are savepoints that are preserved for longer use, so they are not overwritten by the next savepoint.
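
As a rough illustration of the interval-based trigger, the following Python sketch checks whether an automatic savepoint would be due. The function name and its arguments are hypothetical. Only the 300-second default of savepoint_interval_s comes from the text above. The sketch models the timing rule only, not SAP HANA's actual scheduler.

```python
from datetime import datetime, timedelta

# Default of global.ini -> [persistence] -> savepoint_interval_s (from the text).
DEFAULT_SAVEPOINT_INTERVAL_S = 300


def savepoint_due(last_savepoint: datetime, now: datetime,
                  interval_s: int = DEFAULT_SAVEPOINT_INTERVAL_S) -> bool:
    """Return True once at least interval_s seconds have passed
    since the last savepoint (hypothetical helper, timing rule only)."""
    return now - last_savepoint >= timedelta(seconds=interval_s)


last = datetime(2024, 1, 1, 12, 0, 0)
print(savepoint_due(last, last + timedelta(seconds=120)))  # still inside the interval
print(savepoint_due(last, last + timedelta(seconds=300)))  # interval has elapsed
```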

3. Where can I find more information related to savepoints?

The following SAP HANA views contain savepoint related information:

View                    Details
M_SAVEPOINT_STATISTICS  Global savepoint information per host and service
M_SAVEPOINTS            Detailed information for individual savepoints

The following SQL statements of SAP Note 1969700 can be used to analyze savepoints:

SQL statement               Details
SQL: “HANA_IO_Savepoints”   Detailed information for individual savepoints
SQL: “HANA_IO_Snapshots”    Snapshot information

4. Which indications exist for problems related to savepoints?

The following SAP HANA alerts indicate problems in the area of savepoints:

Alert  Name                             SAP Note  Description
28     Most recent savepoint operation  1977291   Determines how long ago the last savepoint was defined, that is, how long ago a complete, consistent image of the database was persisted to disk.
54     Savepoint duration               1977220   Identifies long-running savepoint operations.
66     Storage snapshot is prepared     2081405   Determines whether or not the period, during which the database is prepared for a storage snapshot, exceeds a given threshold.

SQL: “HANA_Configuration_MiniChecks” (SAP Notes 1969700, 1999993) returns a potentially critical issue (C = ‘X’) for one of the following individual checks:

Check ID  Details
350       Crit. savepoint phases > 10 s (last day)
351       Crit. savepoint phase avg. (s, last day)
352       Crit. savepoint phase max. (s, last day)
355       Time since last savepoint (s)
357       Savepoint write throughput (MB/s)
358       Savepoints taking longer than 900 s (last day)
380       Age of oldest non-backup snapshot (h)

Threads executing INSERT / UPDATE / DELETE operations may be blocked by a savepoint if they are in thread state SharedLockEnter, waiting for a lock of type ConsistentChangeLock. See SAP Note 1999998 for more information.

5. Are savepoints online operations?

Most of the savepoint is performed online without holding a lock, but the finalization of the savepoint requires a lock. This step is called the critical phase of the savepoint. During the critical phase, all INSERT / UPDATE / DELETE and DDL operations are blocked. You can identify blocked operations by the thread state SharedLockEnter and a lock of type ConsistentChangeLock. See SAP Note 1999998 for more information. The critical phase should normally not take longer than 1 to 2 seconds.
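
The identification rule above can be sketched as a simple filter in Python. This is a minimal illustration that assumes the thread information has already been fetched into dictionaries. The keys thread_state and lock_name as well as the sample rows are hypothetical, not actual SAP HANA view columns.

```python
def blocked_by_savepoint(threads):
    """Return the threads that wait for the savepoint's critical phase:
    state SharedLockEnter and lock type ConsistentChangeLock."""
    return [
        t for t in threads
        if t.get("thread_state") == "SharedLockEnter"
        and t.get("lock_name") == "ConsistentChangeLock"
    ]


# Hypothetical sample rows for illustration only.
sample = [
    {"thread_id": 1, "thread_state": "SharedLockEnter", "lock_name": "ConsistentChangeLock"},
    {"thread_id": 2, "thread_state": "Running", "lock_name": None},
    {"thread_id": 3, "thread_state": "SharedLockEnter", "lock_name": "SomeOtherLock"},
]
for thread in blocked_by_savepoint(sample):
    print(thread["thread_id"])
```

Both conditions have to match, because the thread state SharedLockEnter on its own also occurs with other lock types.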

6. How can typical savepoint issues be analyzed and resolved?

You can use the following approaches to analyze and resolve typical savepoint issues:

Symptoms Details
Long running critical savepoint phases: If the critical savepoint phases take several seconds, the cause is usually a bottleneck in the disk I/O area. See SAP Note 1999930 for more information on how to analyze and resolve issues in this area.
Low write throughput of savepoints: You can check the savepoint write throughput via SQL: “HANA_Configuration_MiniChecks” (‘Savepoint write throughput (MB/s)’) of SAP Note 1999993. If it is below expectations, proceed according to SAP Note 1999930 in order to determine and resolve problems in the I/O area.
Significant time since last savepoint: Make sure that the parameter controlling automatic savepoints is set to a reasonable value. Ideally, the default value is kept:

global.ini -> [persistence] -> savepoint_interval_s

In addition, particularly long-running savepoints that do not finish successfully for a long time can cause a significant time since the last successful savepoint. See “Long running savepoints” below for more details.

Long running savepoints: You can use SQL: “HANA_IO_Savepoints” (MIN_SAVEPOINT_DURATION_S = <min_runtime_s>) of SAP Note 1969700 to check for long-running savepoints. A good value for <min_runtime_s> is 900, so that only savepoints with a duration of more than 15 minutes are displayed.
Depending on the output of this command, the following situations can be distinguished:
Long running savepoints due to low write throughput: If the write throughput (MB_PER_S) is significantly lower than 100 MB/s, proceed as described for the symptom “Low write throughput of savepoints” above.
Long running savepoints due to high I/O write volume: If the amount of data written to disk (SUM_SIZE_MB) is much higher than expected, first check via SQL: “HANA_Tables_IOStatistics” (ORDER_BY = ‘WRITE’) of SAP Note 1969700 for specific tables with a high amount of I/O writes. For these tables, check from an application perspective whether the amount of changes can be reduced. Additionally, a high “prepare flush retry count” can have a significant impact on the amount of data written during the savepoint.
Old database snapshots not related to backups: You can look into M_SNAPSHOTS or use SQL: “HANA_IO_Snapshots” (SAP Note 1969700) in order to check for currently existing snapshots. See SAP Note 2039883 for more information regarding database snapshots.
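
The distinction between the two kinds of long-running savepoints can be sketched in Python as follows. This is a simplified illustration: the function, the expected_size_mb argument and the volume_factor of 2.0 are assumptions made for the example, while the 900-second and 100 MB/s reference values come from the text above. The text says "significantly lower" and "much higher", so the hard cut-offs used here are a simplification.

```python
MIN_RUNTIME_S = 900            # suggested <min_runtime_s> from the text
MIN_THROUGHPUT_MB_PER_S = 100  # throughput reference value from the text


def classify_long_savepoint(duration_s, mb_per_s, sum_size_mb,
                            expected_size_mb, volume_factor=2.0):
    """Return None if the savepoint is not long running, otherwise a list
    of likely causes (possibly empty). expected_size_mb and volume_factor
    are hypothetical inputs that stand in for 'much higher than expected'."""
    if duration_s <= MIN_RUNTIME_S:
        return None
    causes = []
    if mb_per_s < MIN_THROUGHPUT_MB_PER_S:
        causes.append("low write throughput")
    if sum_size_mb > volume_factor * expected_size_mb:
        causes.append("high I/O write volume")
    return causes


# Hypothetical figures: 20-minute savepoint at 40 MB/s with a normal data volume.
print(classify_long_savepoint(1200, 40, 1000, 1000))
```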

7. What is the prepare flush retry count and how can I optimize it?

As already explained, the critical phase of the savepoint requires a lock. To keep the critical phase as short as possible, SAP HANA only enters it if it expects that its duration will not exceed a critical limit. If many memory pages are modified in parallel to the savepoint preparation, the critical limit may not be met, and SAP HANA starts another savepoint preparation. This retry activity can happen many times and can involve a high amount of additional write I/O.
You can check the column PREPARE_FLUSH_RETRY_COUNT of the internal table SYS.M_SAVEPOINTS_ in order to see whether a long-running savepoint performed a high number of retries before it entered the critical phase.
This mechanism is controlled by the following SAP HANA parameters:

Parameter: global.ini -> [persistence] -> savepoint_max_pre_critical_flush_duration
Unit: s
Default: 0 (SPS 08 and below), 900 (SPS 09 and above)
Details: This parameter defines the maximum time (in seconds) that should be spent on optimizing the duration of the critical phase. The value 0 means that there is no time limit, so a savepoint can potentially run for a very long time without being able to start the critical phase.

Parameter: global.ini -> [persistence] -> savepoint_pre_critical_flush_retry_threshold
Unit: ms
Default: 3000
Details: This parameter defines an upper limit for the expected critical phase duration. As soon as SAP HANA assumes that the critical phase will be below this limit, the critical phase is entered. Setting this parameter to a higher value (e.g. 5000 for 5 seconds or 10000 for 10 seconds) reduces the number of retries before entering the critical phase.

These parameters are a trade-off between savepoint duration and I/O write overhead on the one hand and the locking situation during the critical phase on the other hand. To avoid high I/O overhead and very long savepoint times beyond 15 minutes, you can set savepoint_max_pre_critical_flush_duration to 900 without having to expect significantly worse locking behavior.
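
To make the trade-off more tangible, the following Python sketch simulates the retry behavior with a toy model. It is not SAP HANA's internal algorithm. It assumes a constant write throughput, a constant rate at which pages are modified in parallel, and an expected critical phase duration equal to the time needed to write the pages that are still pending. The function, its arguments and the max_passes safety cap are hypothetical. Only the two defaults (3000 ms and 900 s) are taken from the parameter descriptions above.

```python
from dataclasses import dataclass


@dataclass
class PrepareResult:
    retries: int          # flush passes repeated after the initial one
    written_mb: float     # data written before the critical phase
    elapsed_s: float      # time spent in savepoint preparation
    hit_time_limit: bool  # True if the time limit ended the retries


def simulate_prepare(dirty_mb, write_mb_per_s, dirty_rate_mb_per_s,
                     retry_threshold_ms=3000, max_flush_duration_s=900,
                     max_passes=10_000):
    """Toy model of the pre-critical flush retries (assumptions only)."""
    pending = float(dirty_mb)
    written = elapsed = 0.0
    passes = 0
    hit_limit = False
    while passes < max_passes:  # safety cap so the sketch always terminates
        # One flush pass writes everything that is currently pending.
        pass_s = pending / write_mb_per_s
        written += pending
        elapsed += pass_s
        passes += 1
        # Pages modified during the pass are pending for the next one.
        pending = dirty_rate_mb_per_s * pass_s
        expected_critical_ms = pending / write_mb_per_s * 1000
        if expected_critical_ms < retry_threshold_ms:
            break  # expected critical phase is short enough
        if max_flush_duration_s > 0 and elapsed >= max_flush_duration_s:
            hit_limit = True  # stop optimizing, enter the critical phase anyway
            break
    return PrepareResult(max(passes - 1, 0), written, elapsed, hit_limit)


# Hypothetical workload: 60000 MB modified, 200 MB/s writes, 50 MB/s new changes.
print(simulate_prepare(60000, 200, 50))
print(simulate_prepare(60000, 200, 50, retry_threshold_ms=5000))
```

In this model, a higher retry threshold ends the preparation after fewer passes and with less data written, at the cost of a longer expected critical phase. If pages are modified as fast as they are written, only the time limit ends the retries.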

8. What kinds of snapshots exist?

The following types of snapshots exist:

Type: System replication snapshots
SAP Note: 1999880
Creation: Regularly, in order to provide a consistent persistence state for system replication
Deletion: Automatically or, in exceptional cases, manually using hdbcons (“snapshot drop <snapshot_id>”)

Type: Backup snapshots
SAP Note: 2039883
Creation: Support of snapshot based backups
Deletion: Automatically by the backup procedure or, in exceptional cases, manually using “BACKUP DATA DROP SNAPSHOT” (see SAP Note 1703435 for a use case)



