Automated Oracle DB Refreshes on AIX in Skytap
A growing number of customers are choosing Skytap on Azure, and other cloud platforms, to host their AIX footprint. Skytap has become increasingly popular because of its close connection to Azure native services and because it is hosted directly within Microsoft data centers, which has led to both production and test workloads being deployed in Skytap.
Historically, a very large number of AIX workloads on Power systems have hosted critical Oracle databases, usually supporting the main ERP, billing, CRM and other production systems. Besides being instrumental to the customers' business continuity, these databases are also the main source of truth when it comes to testing with customer, partner and business data as a whole.
One of the main challenges we keep coming across is how to enable customers to refresh their test environments quickly and safely with the most accurate and up-to-date data they have.
Non-Cloud Approaches
When it comes to refreshing data in an Oracle DB test environment, customers usually take one of a few main directions:
- Restore with a backup tool – the regular backup of the database is used to restore the production data onto the test databases. Usually done by backup and DB admins.
- Manually copy tables – using Oracle instruments, admins copy and import certain parts of the DB into the test environment.
- Storage snapshots – using storage snapshots, the file system beneath the Oracle DB is copied instantly and restored in a test environment. Further steps are then performed to prepare that file system for testing.
The list is of course not exhaustive and covers only the main implementations that we see prevailing. All of these approaches work to a certain degree and have their advantages and disadvantages. It is when cloud migration comes into play that one starts to think about how to transfer them into the “new” world.
Taking Oracle DB Refreshes into Skytap
Skytap is an Azure marketplace service providing an IBM Power-based public cloud, mainly for AIX and iSeries workloads. Because of the way Skytap works with storage, you should be careful about how you implement AIX LPARs that host Oracle databases.
The first two approaches will work without any issues in Skytap. Backup has its caveats, as tape is not supported in the cloud; however, once a backup service is implemented, an Oracle DB refresh from backup will be similar to how it was on-site. Copying Oracle DB tables is the same no matter where you are.
Storage snapshots are where the challenges come in.
Storage Snapshots in Skytap
There is no clear concept of storage snapshots in Skytap. Skytap uses so-called templating of LPARs, which differs from a pure storage volume snapshot in a few ways.
Templating can be done on “local” LPAR disks, which are the ones directly attached to the system and which cannot be attached to any other system in the environment, region or account. Templating local disks basically snapshots the ENTIRE LPAR, and you cannot choose which disks to snapshot.
Templating can also be done for multi-attached (MAS) disks, which are the ones created to be shared between LPARs. This is exactly where some granularity can be achieved. MAS disks are created in sets, and it is the whole set that you snapshot when you create a template. This is important for how you structure your MAS disks when you migrate your systems into Skytap.
Thus, take care to place the Oracle DB file systems that will need snapshots into separate MAS sets, so that you can template them separately.
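To make the layout rule above concrete, here is a minimal sketch that checks a planned disk layout for MAS sets mixing refreshed and non-refreshed file systems. The mount points, set names and the layout dictionary itself are hypothetical and only serve as an illustration; nothing here calls Skytap.

```python
from collections import defaultdict

# Hypothetical layout: mount point -> (MAS set name, needs refresh snapshots?)
LAYOUT = {
    "/oracle/PROD/data": ("mas-oradata", True),
    "/oracle/PROD/redo": ("mas-oradata", True),
    "/oracle/PROD/arch": ("mas-oraarch", False),
    "/backup/stage": ("mas-shared", False),
}


def mixed_sets(layout):
    """Return MAS sets that hold both refreshed and non-refreshed file systems."""
    flags = defaultdict(set)
    for _mount, (mas_set, needs_refresh) in layout.items():
        flags[mas_set].add(needs_refresh)
    # A set that has seen both True and False would drag extra data into a template.
    return sorted(name for name, seen in flags.items() if len(seen) == 2)


if __name__ == "__main__":
    print(mixed_sets(LAYOUT))
```

An empty result means every set can be templated without pulling in file systems that do not belong to the refresh.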
Automation of Refreshes
With the above in mind, let’s look at how you would optimize a refresh and what you need to keep in mind while doing so.
When we automate Oracle DB refreshes for customers, they usually come to us with an existing script. That script typically does a few things:
- Puts the source Oracle DB into some form of consistent state – hot backup, stopped, etc.
- Performs a snapshot at the storage level
- Returns the source Oracle DB to its normal functional state
- Mounts the snapshot on the target LPAR
- Performs the steps needed to rename the Oracle DB and file systems
- Starts the new Oracle DB
This is usually scheduled with a tool like Control-M (very widely used by outsourced customers), which runs a number of jobs and alerts on failures and successes.
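The sequence of steps listed above can be sketched as a small orchestration function. This is only an illustration under our own assumptions: every step is a hypothetical callable supplied by the caller (in practice a wrapper around your existing script fragments), and the step names are invented for the example. What the sketch does show is that the source DB is returned to its normal state even when the snapshot fails.

```python
from contextlib import contextmanager


@contextmanager
def consistent_source(begin, end):
    """Hold the source DB in a consistent state and always release it afterwards."""
    begin()
    try:
        yield
    finally:
        end()  # runs even if the snapshot step raises


def refresh(steps):
    """Run one refresh; `steps` maps hypothetical step names to callables."""
    with consistent_source(steps["begin_consistent"], steps["end_consistent"]):
        snapshot = steps["take_snapshot"]()
    # The source DB is back to normal here; the rest only touches the target.
    steps["mount_snapshot"](snapshot)
    steps["rename_db"]()
    steps["start_target_db"]()
    return snapshot
```

Keeping the consistent-state window as short as possible, and guarded by `finally`, is the part most worth carrying over from any legacy script.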
What is the challenge in Skytap?
Refresh Automation Scheduling in Skytap
Within Skytap, you have to change a few things.
First of all, if you want to rethink how you schedule things, take a look at the possibilities that Azure DevOps has to offer. It might not cover all of your requirements, but for the purposes of pure Oracle DB refreshes it usually does.
We tend to combine Azure DevOps with a small Terraform integration (if we have to create the environment every time) and a wider Ansible implementation that helps us organize the API calls to Skytap.
This gives you central and secure storage of the Skytap API token, scheduling of DB refresh runs, and alerting on successes and failures.
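As an illustration of keeping the token out of scripts, the sketch below builds the request headers from values that the pipeline injects as environment variables. The variable names SKYTAP_USER and SKYTAP_TOKEN are hypothetical, and we assume here that the Skytap REST API is called with HTTP Basic authentication (user name plus API token) and JSON payloads; verify both against your account's API settings.

```python
import base64
import os


def skytap_headers(env=os.environ):
    """Build HTTP headers for Skytap REST calls from pipeline-injected secrets."""
    user = env.get("SKYTAP_USER")    # hypothetical variable name
    token = env.get("SKYTAP_TOKEN")  # hypothetical variable name
    if not user or not token:
        raise RuntimeError("Skytap credentials are not set in the environment")
    # Assumption: HTTP Basic authentication with "user:token".
    basic = base64.b64encode(f"{user}:{token}".encode()).decode()
    return {
        "Authorization": f"Basic {basic}",
        "Accept": "application/json",
        "Content-Type": "application/json",
    }
```

The token then lives only in the pipeline's secret store and never in the repository or on the AIX systems.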
Skytap Terraform and APIs
When you work with Skytap and start building automations, you quickly realize that Terraform support is largely undeveloped, so for most tasks all you have is API calls. There are a few things you can do with Terraform, such as creating an environment or attaching subnets, so if you already deploy your Azure resources with Terraform, there is no problem using it for that part of Skytap as well. For everything else you will need API calls, which we tend to make with Ansible. It is stable, easy to use and easy to document, and we use it for the AIX reconfiguration anyway.
What do you need from Skytap API?
Once you have put the Oracle DB into a consistent state, you need to take a snapshot of the Oracle DB file system. As explained above, it needs to sit on a separate MAS set, so that you template as little as possible. Once the snapshot is done, return your production DB to its normal state and continue. The next step is to mount that snapshot on the target LPAR, and for that to happen you need to stop the target LPAR.
Be careful how you structure your LPARs within environments, because operations within an environment (such as stopping an LPAR) will make the whole environment busy. Be sure to check for that, and confirm in a loop that the LPAR is really stopped. The Skytap API sometimes returns a successful state before the operation has actually finished, so give it a bit more time.
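A minimal sketch of such a confirmation loop is shown below. It does not call Skytap itself: `get_runstate` is a hypothetical callable that you would implement as a GET against the LPAR and that returns its current state as a string. The value "stopped" and the idea of requiring several consecutive confirmations are our assumptions for the example, the latter as one way to avoid trusting a single early success.

```python
import time


def wait_until_stopped(get_runstate, timeout=600, interval=10, confirmations=3,
                       sleep=time.sleep, clock=time.monotonic):
    """Poll until the LPAR reports 'stopped' on several consecutive checks."""
    deadline = clock() + timeout
    streak = 0
    while clock() < deadline:
        # Any other state (for example a busy environment) resets the streak.
        streak = streak + 1 if get_runstate() == "stopped" else 0
        if streak >= confirmations:
            return True
        sleep(interval)
    return False
```

The `sleep` and `clock` parameters exist only so the loop can be exercised without waiting; in a pipeline you would leave them at their defaults and fail the job when the function returns False.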
After that, delete the old volume, attach the new snapshot / template and start the LPAR. This needs validation steps as well; however, it will most probably complete quickly, and you can confirm from the Ansible worker that you can reach the system over ssh.
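One simple validation step is to wait until the ssh port of the target LPAR accepts connections. The sketch below uses only the Python standard library; the default port and the timeout values are assumptions. It confirms only that the port is open, not that a login succeeds, so a real ssh command from the Ansible worker is still the final check.

```python
import socket
import time


def wait_for_ssh(host, port=22, timeout=300, interval=5):
    """Return True once a TCP connection to the ssh port succeeds."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # The connection is closed again as soon as the with-block exits.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)  # not reachable yet, try again
    return False
```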
More configuration steps
You usually have these from the previous scripts; however, make sure you clean up disk numbers and disk paths correctly, as they are detected a bit differently on systems in Skytap.
As always, design well, document and automate and make sure to clean up after you are done – old templates and MAS disks are paid for and you do not need them.
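For the cleanup part, a small selection function helps to decide which refresh templates can go. This is a sketch under our own assumptions: the template records are plain dictionaries with hypothetical `id` and `created` fields that you would fill from your own inventory or API query, and the retention values are examples. The actual deletion call is deliberately left out.

```python
from datetime import datetime, timedelta, timezone


def templates_to_delete(templates, keep_latest=2, max_age_days=7, now=None):
    """Pick refresh templates that are old and not among the newest few."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    newest_first = sorted(templates, key=lambda t: t["created"], reverse=True)
    # Always keep the newest `keep_latest`; of the rest, drop only the old ones.
    return [t["id"] for t in newest_first[keep_latest:] if t["created"] < cutoff]
```

Keeping the newest templates regardless of age leaves you a fallback if the latest refresh turns out to be unusable.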
If you happen to need more help during these automation steps, make sure to contact us. We can:
- Analyze your old scripts
- Build your DevOps and Ansible footprint
- Design your Skytap landing zone
- Optimize your script or write it from scratch
- Test and validate with you