in

Migrating On-Premise Oracle to AWS: My Lessons from the Trenches

Hey there! As a database engineer who‘s worked on both enterprise Oracle environments and cloud-based analytics systems, I‘ve had a front row seat to witness companies shifting their databases into the AWS cloud. From this vantage point, I can definitely confirm that the transition is accelerating. Reports show 80% of databases will be deployed or migrated to a cloud platform by 2022. Working hands-on through these initiatives has taught me that while conceptually simple, executing major database migrations brings some unique puzzles.

In this write-up, I wanted to pull back the curtain a bit and walk through a recent real-world project syncing an on-premise Oracle system into Amazon‘s cloud. My aim is to provide some transparent knowledge sharing coupled with practical recommendations to smooth your own migration Magnum Opus! I promise we‘ll tackle this step-by-step rather than diving too deep into the bits and bytes. Sound good? Let‘s get started!

"Quickly Lift and Shift Our Data to the Cloud, What Could Go Wrong!?"

I‘m sure you‘ve been there too. The C-suite reads some Forbes article about "Strategy X" powered by machine learning pixie dust or blockchain or fill-in-the-blank. Then this kind of vague-but-urgent dictate comes down through middle management for enabling said strategy by getting all that precious on-premise data into AWS pronto.

Our actual kickoff call went something like this:

"We need to be ready to build next-gen services leveraging AWS capabilities likeAnalytics and Machine Learning. Please copy our full production database into Redshift or Aurora so teams can start accessing cloud data ASAP. Just match what we have on-premise for now – we‘ll transform things later once we figure out the new architecture."

I won‘t lie, hearing this initially sent up some red flags. Even with today‘s amazing data migration tools, lift-and-shift projects that bypass planning or optimization rarely end well. Early compromises stack up as technical debt that needs repaying when scalability bogs down. However, the business imperative seemed clear here – so as the assigned DBA, my task was to make it work!

Before diving in on tooling though, I felt a responsibility to at least attempt getting key stakeholder concerns on the table:

  • Who is the main cloud data consumer? More analytical users have very different access needs compared to transactional applications.

  • How will standard reporting queries change? If large portions are deprecated, the optimal data model changes too.

  • What growth trajectory are we planning for?Cloud migrations often target scalability gains from the start.

I‘m always wary of the advice "We‘ll deal with it later, just copy the data for now!" Still, big enterprises tend to gain momentum around sub-optimal decisions. So the migration was kicking off regardless. Time get tactical!

On the positive side, specialized migration tools have come a very long way in smoothing these transitions.

Using AWS Schema Conversion Tool to Baseline the Migration

The AWS Schema Conversion Tool (SCT) plays a key role here. When pointed at an on-premise database like Oracle, it reverse engineers the existing schema and data model. This provides a solid starting point for mapping into the target AWS environment.

SCT generates a few super helpful outputs:

1. Documentation of the Current Source Data Model – Knowing what‘s in place allows appropriately shaping where we‘re headed.

2. Suggested Target Schema – Uses details of the selected AWS database to propose table structures.

3. Database Deployment Scripts – Allows easily standing up the target database with all necessary tables, indexes, etc.

Reference: AWS SCT Overview

What I typically find is the raw SCT output serves more as inspiration than final architecture:

  • Adjust for Access Patterns – Indexes, partitions, column encoding tuned for target use cases.

  • Simplify Overly Granular Structures – Analytical querying favors some denormalization.

  • Model for Growth Expectations – Storage limits and IO influence costs and scalability.

Still, having SCT‘s proposed schema in hand makes for a far easier springboard towards a well-formed target data model.

AWS Database Migration Service: A Workhorse for Common Transition Patterns

With source and target details mapped out, the actual replication of data falls to the AWS Database Migration Service (DMS). This tool boasts native connectivity an extremely broad set of databases and data warehouses both in AWS and on-premise.

Reference: AWS DMS Overview

DMS tackles migration via configurable "tasks" representing replicating an individual source table into a corresponding target. This granular approach allows tuning at the data set level based on unique characteristics.

Let‘s explore two very common use cases – one-time full migrations and ongoing sync.

Full One-Time Migrations: Simple Conceptually, Techie to Pull Off

For situations where the goal is to fully transition from on-premise to cloud infrastructure, a one-shot migration makes good sense. Here the path boils down to:

full data migration

1. Define DMS Tasks per Source Table: Straightforward, but lots of detail work to set up properly.

2. Performance Test and Tune Aggressively: Sooo many knobs for parallelism, memory, adaptive querying, etc. This is where skilled DBAs earn their keep!

3. Deploy Target Schema: Run those previously generated scripts to ready target tables.

4. Schedule Migration Time Window: Pick a maintenance window, pray to the database gods, and hold on!

On projects moving 100s of terabytes, I often end up treating this like a military operation:

  • Conducting nightly test runs for weeks beforehand to baseline
  • Building dashboards to monitor every possible DMS metric
  • Working hand-in-hand with infrastructure admins to ensure ample IO, memory and network bandwidth
  • Installing triggers to automatically rollback if thresholds are exceeded

Things definitely get interesting when relying on legacy components like crusty storage arrays! Still, solid planning and performance engineering allows eventually migrating even the largest enterprise databases. The cutover itself is usually anti-climactic if the groundwork is laid properly.

Live Migrations Through Ongoing Synchronization

Live database transitions aimed at keeping systems in sync during a transition period are trickier. This pattern leans heavily on a DMS capability called change data capture (CDC):

incremental daily sync

Here, rather than periodic batched data copies, DMS perpetually tracks and replicates source system changes. This works thanks to integration with the transaction logs from databases like Oracle.

Based on what I‘ve seen, CDC-based transitional migrations often run into complications like:

CDC Data Extraction: Choosing between native database log access vs DMS alternates – each with tradeoffs.

Schema Changes: Altering source tables can break replication, forcing DMS reconfiguration.

Performance Limitations: Large volumes of change can exceed CDC infrastructure capacity.

Basically, it takes very thorough testing to trust CDC will smoothly sync production systems without intermittent gaps. I guide teams to temper expectations, but some learning by failure is often unavoidable.

When stability issues inevitably crop up, old faithful batch-oriented migrations serve as a fallback for keeping daily change volumes in check. A bit messy perhaps, but necessities of pragmatism!

When All Else Fails: Custom Solutions to Save the Day!

I‘ll wrap up here with a remarkable situation that calls back to the opening premise – a mandate to enable live database migration no matter how massive or messy the source system.

In this case, our on-premise Oracle database topped 8 PB in size. It handled a principal line-of-business application mixing both transactional requests and analytics. Like many mature environments, there was a rats nest of complex cron jobs, ETL, and workflows relying on this system. All critical business operations would halt if it hiccuped.

Oh, and did I mention peers warned me daily change volumes frequently exceeded 0.5 TB?

Even with a muscular migration stack, keeping the cloud copy continuously in sync looked hopeless! Our POC ended up falling a full week behind on replication over just days of testing. I explored every trick I‘d used before to accelerate DMS – parallel tasks, using multiple migration instances, voting a batch process sacrifice to the data gods.

But the gap kept growing…it seemed we hit a hard technical limit.

Just when all looked lost, the solution came from an unexpected direction – the infrastructure team reached out suggesting a creative approach:

failure in db sync

1. Isolate Large Tables: We moved the biggest data offenders foreaching weekly batch loads.

2. Sharded/Partitioned Tables: This let us split replication across swarms of parallel DMS tasks targeting unique data slices.

3. Load Balanced Evenly: With 4 total DMS instances, we distributed load meticulously.

Though a bit Rube Goldberg, once tuned, this mash up allowed completing daily migrations within our SLA. Sometimes engineering success boils down to whatever it takes pragmatism!

Key Takeaways from the Trenches

I hoped walking through these experiences gives a transparent peek behind the curtain at real-world DB migration operations. While conceptually straightforward, once the rubber hits the road in enterprise environments, unexpected obstacles have a way of surfacing!

That said, the capabilities of modern data migration tooling like AWS SCT and DMS are extremely impressive. When backed by thoughtful upfront planning and unrelenting performance diligence, the outcomes can exceed expectations.

If relaying these stories helps even one team avoid similar pitfalls or spark creative problem solving, I‘ll be thrilled! I‘m always happy to chat more candidly on lessons learned. Never hesitate to drop me a line!

AlexisKestler

Written by Alexis Kestler

A female web designer and programmer - Now is a 36-year IT professional with over 15 years of experience living in NorCal. I enjoy keeping my feet wet in the world of technology through reading, working, and researching topics that pique my interest.