Wednesday, 31 March 2021

Crx2Oak Migration tool Concepts and Demo

 Crx2Oak helps migrate data from older CQ versions based on Apache Jackrabbit 2 to Oak, and it can also be used to copy data between Oak repositories.


 

The tool can be used for:

  • Migrating from older CQ 5 versions to AEM 6
  • Copying data between multiple Oak repositories
  • Converting data between different Oak MicroKernel implementations (for example, S3DataStore to FileDataStore)


Some of its features are:

  • The migration can be interrupted at any time, with the possibility to resume it afterwards.
  • Custom Java logic can also be implemented using CommitHooks.
  • CRX2Oak also supports memory-mapped operations by default. Memory mapping greatly improves performance.


Parameters
Node Store Options
--cache: Cache size in MB (default is 256)

--mmap: Enable memory mapped file access for Segment Store

--src-password: Password for the source RDB

--src-user: User for the source RDB

--user: User for the target RDB

--password: Password for the target RDB.

Version Store Options
--copy-orphaned-versions: Controls whether orphaned versions are copied; set it to false to skip them. Parameters supported are: true, false and yyyy-mm-dd. Defaults to true.

--copy-versions: Copies the version storage. Parameters: true, false, yyyy-mm-dd. Defaults to true.
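Both version store options accept the same three kinds of value, so it can help to check a value before starting a long migration. The helper below is a minimal sketch in Python; the function name parse_version_option is hypothetical and is not part of CRX2Oak.

```python
from datetime import date, datetime
from typing import Union


def parse_version_option(value: str) -> Union[bool, date]:
    """Parse a value for --copy-versions or --copy-orphaned-versions.

    Accepts "true", "false" or a yyyy-mm-dd date, the three forms listed
    in the option descriptions. Raises ValueError for anything else.
    """
    lowered = value.strip().lower()
    if lowered == "true":
        return True
    if lowered == "false":
        return False
    # strptime raises ValueError for non-date strings and for
    # impossible dates such as 2021-02-30.
    return datetime.strptime(lowered, "%Y-%m-%d").date()
```

A typo such as "ture" or an impossible date then fails immediately, before the tool is launched.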


Path Options
--include-paths: Comma-separated list of paths to include during copy
--merge-paths: Comma-separated list of paths to merge during copy
--exclude-paths: Comma-separated list of paths to exclude during copy.
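Each of the three path options takes a single comma-separated list. The sketch below, in Python, shows one way to build those arguments from ordinary lists. The function name path_options is hypothetical, the example paths are illustrative, and the flag=value form is an assumption about how the options are passed.

```python
from typing import Iterable, List, Optional


def path_options(
    include: Optional[Iterable[str]] = None,
    merge: Optional[Iterable[str]] = None,
    exclude: Optional[Iterable[str]] = None,
) -> List[str]:
    """Build --include-paths / --merge-paths / --exclude-paths arguments."""
    args: List[str] = []
    for flag, paths in (
        ("--include-paths", include),
        ("--merge-paths", merge),
        ("--exclude-paths", exclude),
    ):
        cleaned = [p.strip() for p in (paths or []) if p.strip()]
        for p in cleaned:
            if not p.startswith("/"):
                raise ValueError(f"{flag}: expected an absolute path, got {p!r}")
        if cleaned:
            # One comma-separated list per option (flag=value form is assumed).
            args.append(flag + "=" + ",".join(cleaned))
    return args


if __name__ == "__main__":
    # Illustrative paths only.
    print(path_options(include=["/content/dam", "/etc/tags"],
                       exclude=["/content/dam/tmp"]))
```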


Source Blob Store Options
--src-datastore: The datastore directory to be used as a source FileDataStore

--src-fileblobstore: The datastore directory to be used as a source FileBlobStore

--src-s3datastore: The datastore directory to be used for the source S3DataStore

--src-s3config: The configuration file for the source S3DataStore.


Destination BlobStore Options
--datastore: The datastore directory to be used as a target FileDataStore

--fileblobstore: The datastore directory to be used as a target FileBlobStore

--s3datastore: The datastore directory to be used for the target S3DataStore

--s3config: The configuration file for the target S3DataStore.
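To see how the source and destination blob store options fit together, here is a sketch in Python that assembles the arguments for the S3DataStore to FileDataStore case mentioned earlier. The function names, the jar file name and all paths are hypothetical. The argument order (options first, then source and destination repository) and the flag=value form are assumptions, so check the tool's help output before running anything.

```python
import shlex
from typing import List


def s3_to_file_datastore_command(
    source_repo: str,
    target_repo: str,
    src_s3datastore: str,
    src_s3config: str,
    datastore: str,
    jar: str = "crx2oak.jar",
) -> List[str]:
    """Assemble the arguments for an S3DataStore to FileDataStore copy.

    Assumption: options come first, followed by the source and
    destination repository paths.
    """
    return [
        "java", "-jar", jar,
        f"--src-s3datastore={src_s3datastore}",
        f"--src-s3config={src_s3config}",
        f"--datastore={datastore}",
        source_repo,
        target_repo,
    ]


def as_shell(command: List[str]) -> str:
    """Render the argument list as one shell-quoted line for review."""
    return shlex.join(command)
```

The sketch only builds the command and does not execute it, so the line can be reviewed first.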

Help Options
-?, -h, --help: Shows help information.

Debugging
Add --log-level TRACE or --log-level DEBUG to the command line for more verbose logging.

Demo Video Series

What are the options for asset migration between AEM instances?


There are cases where assets need to be moved between AEM instances, for example when several asset repositories are being merged into a single instance. When moving assets, make sure that asset metadata, asset versions and the tags associated with the assets are all moved and validated.

The overall size of the assets is another consideration when selecting a migration tool. Below are a few tools worth considering for asset migration.


-Replication Agent
By configuring a replication agent, we can migrate assets across AEM instances.

-Vault Remote Copy

Jackrabbit vault offers a simple method to copy nodes between repositories.

This can be used for bulk assets.
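As a rough illustration, the Python sketch below builds a vlt rcp invocation for copying a DAM folder from one instance to another. The host names, credentials and content path are placeholders. The repository address format and the -r (recursive) and -b (batch size) options follow the rcp page listed in the references, but verify them against your FileVault version. The helper names rcp_url and vlt_rcp_command are hypothetical.

```python
from typing import List
from urllib.parse import quote


def rcp_url(host: str, path: str, user: str, password: str) -> str:
    """Build a repository address of the form http://user:pw@host/crx/-/jcr:root/path."""
    # Percent-encode credentials so characters like "@" do not break the URL.
    creds = quote(user, safe="") + ":" + quote(password, safe="")
    return f"http://{creds}@{host}/crx/-/jcr:root{path}"


def vlt_rcp_command(src: str, dst: str, batch_size: int = 1000) -> List[str]:
    """Argument list for a recursive copy that saves every batch_size nodes."""
    return ["vlt", "rcp", "-r", "-b", str(batch_size), src, dst]


if __name__ == "__main__":
    # Placeholder hosts and credentials.
    src = rcp_url("localhost:4502", "/content/dam/projects", "admin", "admin")
    dst = rcp_url("localhost:4503", "/content/dam/projects", "admin", "admin")
    print(" ".join(vlt_rcp_command(src, dst)))
```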

References:
https://jackrabbit.apache.org/filevault/rcp.html
https://experienceleague.adobe.com/docs/experience-manager-65/assets/administer/assets-migration-guide.html?lang=en#migrating-between-aem-instances
License: Jackrabbit Oak and any of its parts are licensed according to the terms listed in the Apache License, Version 2.0

-Grabbit

Grabbit provides a fast and reliable way of copying content from one Sling (specifically Adobe AEM) instance to another.
Grabbit creates a stream of data using Google’s Protocol Buffers aka "ProtoBuf". Protocol Buffers are an extremely efficient (in terms of CPU, memory and wire size) binary protocol that includes compression.

This is one of the Adobe-recommended solutions and can be considered for moving assets in bulk.

Ref:
https://github.com/TWCable/grabbit
https://relevancelab.com/2019/07/04/get-moving-with-aem-and-grabbit/
License: Licensed under the Apache License, Version 2.0 https://github.com/TWCable/grabbit/blob/master/docs/LicenseInfo.adoc


-Recap

A CRX sync option based on vlt rcp.

Ref:
http://adamcin.net/net.adamcin.recap/

-Crx2Oak

A CRX sync option: a tool provided by Adobe for upgrading AEM or migrating CRX data.
This is one of the Adobe-recommended tools and can be used for bulk assets.

-S3 Asset Ingestor

This is part of ACS AEM Commons. It pulls files from an Amazon S3 bucket instead of the local file system. You can load a directory of assets into AEM very easily with this tool. Because it can overload a server with assets, this tool only appears for the “admin” user right now.

Ref:
https://adobe-consulting-services.github.io/acs-aem-commons/features/mcp-tools/asset-ingestion/s3-asset-ingestor/index.html


What is AEP? Adobe Experience Platform FAQ



What is AEP (Adobe Experience Platform)?
AEP helps to capture the customer journey. In AEP, data from various sources is stitched together using a schema, building an identity graph that is unique to each customer. AEP has a data lake into which data from various sources is streamlined and fed. This data is used to create profile data for a customer.
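The idea of stitching identifiers into one graph per customer can be illustrated with a small union-find structure in Python. This is only a conceptual sketch, not how AEP builds its identity graph; the class name and the identifier strings are made up for the example.

```python
from collections import defaultdict
from typing import Dict, List, Set


class IdentityGraph:
    """Toy identity graph: identifiers seen together are merged into one customer."""

    def __init__(self) -> None:
        self._parent: Dict[str, str] = {}

    def _find(self, ident: str) -> str:
        self._parent.setdefault(ident, ident)
        root = ident
        while self._parent[root] != root:
            root = self._parent[root]
        # Path compression: point every node on the walk directly at the root.
        while self._parent[ident] != root:
            nxt = self._parent[ident]
            self._parent[ident] = root
            ident = nxt
        return root

    def link(self, a: str, b: str) -> None:
        """Record that two identifiers were observed for the same customer."""
        root_a, root_b = self._find(a), self._find(b)
        if root_a != root_b:
            self._parent[root_b] = root_a

    def clusters(self) -> List[Set[str]]:
        """Return one set of identifiers per stitched customer."""
        groups: Dict[str, Set[str]] = defaultdict(set)
        for ident in list(self._parent):
            groups[self._find(ident)].add(ident)
        return list(groups.values())


if __name__ == "__main__":
    graph = IdentityGraph()
    graph.link("ecid:123", "email:ann@example.com")        # seen at web login
    graph.link("email:ann@example.com", "phone:555-0100")  # seen in a CRM record
    graph.link("ecid:999", "email:bob@example.com")
    print(graph.clusters())
```

The first two links share an email address, so all three of Ann's identifiers end up in one cluster, while Bob's stay separate.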




AEP is basically a combination of a real-time customer profile, AI and machine learning, and an open ecosystem.

What are the various data sets defined in AEP?
AEP has various data sets, such as:
Attributes: Characteristics like customer name, email, gender etc.
Identities: Unique identity info like ECID (Experience Cloud ID), email, membership ID, phone number etc.
Segments: Categories like online shoppers, gender, location. [One use case: these segments can be exported for use in an email campaign.]
Behaviors: Actions like logging in to the website, installing an application, adding an item to the cart etc.
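The four kinds of data listed here can be pictured as fields of a single profile record. The Python sketch below is purely illustrative: the class names, field names and the rule that assigns a segment are hypothetical and do not reflect AEP's actual schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Behavior:
    name: str        # e.g. "login", "add_to_cart"
    timestamp: str   # ISO 8601 string, kept simple for the sketch


@dataclass
class CustomerProfile:
    attributes: Dict[str, str] = field(default_factory=dict)   # name, gender, ...
    identities: Dict[str, str] = field(default_factory=dict)   # ECID, email, ...
    segments: Set[str] = field(default_factory=set)            # "online shoppers", ...
    behaviors: List[Behavior] = field(default_factory=list)

    def record(self, behavior: Behavior) -> None:
        """Store a behavior and update segments with a toy rule."""
        self.behaviors.append(behavior)
        # Hypothetical rule: anyone who adds an item to the cart is an online shopper.
        if behavior.name == "add_to_cart":
            self.segments.add("online shoppers")


if __name__ == "__main__":
    profile = CustomerProfile(
        attributes={"name": "Ann"},
        identities={"ecid": "123", "email": "ann@example.com"},
    )
    profile.record(Behavior("login", "2021-03-31T10:00:00Z"))
    profile.record(Behavior("add_to_cart", "2021-03-31T10:05:00Z"))
    print(profile.segments)
```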

What does AEP solve?
AEP addresses the following concerns:

  1. Disconnected identities.
  2. Slow and vulnerable data transfer.
  3. AI & ML usually operate in silos, which makes extracting data difficult.
  4. Data governance (CCPA, GDPR etc.) is usually not enforced.
  5. Centralization of multiple features.


Capabilities or major AEP Features:

  • Create actionable, real-time, intelligent customer profiles.
  • Enrich data and derive more insights with AI & ML and data queries (SQL).
  • Innovate with open and composable components (open APIs etc.)
  • Enhance delivery
  • Privacy and data protection (privacy framework, consent offering, security)


AEP integration into other Adobe Cloud Applications
Adobe Experience Cloud applications (Marketing Cloud, Analytics Cloud, Advertising Cloud, Commerce Cloud) can be easily integrated with AEP.

All customer attributes are fed into AEP from different applications.
For example, Adobe Analytics sends data (whenever a data point is captured, it goes into AEP immediately), Adobe Target sends data (decisions made, content presented), Audience Manager sends traits and audiences, and Campaign sends profile and event data. All of this can be easily fed into AEP via Launch.

AEP uses Launch and the Web SDK to import data directly from various applications.

How do AEM or Forms utilize AEP?
AEM can use this AEP data to personalize content on pages or forms.

Various AEP implementation phases and the roles responsible

Plan - Leads and an enterprise architect plan with reference to business goals and document the plan by defining KPIs.
Implement - Data architects and engineers create the data lake (models, schemas), make it available, and ensure data integrity using Query Service.
Use - Marketers, data analysts, data scientists (who use Query Service) and application developers interact with the UI and start integrating with other Adobe applications (Campaign, Target, Analytics).
Grow - The team highlights the growth to the initial set of stakeholders (enterprise architects, company leads etc.).

Basic architecture of AEP

Through data ingestion (from third-party ETL, ERP or sales systems, or from Adobe applications via Launch), we can ingest data into the AEP data lake. The data resides there as batches and files. Any data pushed in is also placed on the Experience Platform pipeline, so it reaches the identity graph and profile store quickly.

The controls native to AEP are:
Access control: specific permission rights to data and users
Data governance: to ensure data integrity
Experience Data Model (XDM) System: a common data model, which can be extended based on needs
Query Service: SQL-based access to data; it has connectors, so other SQL tools can connect to it
Data Science Workspace: allows data scientists to create, build, train and deploy data models
Intelligent Services: like Attribution AI or Customer AI; prebuilt models can be configured to operate on data
Segmentation capabilities: for categorization, covering both streaming and batch
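To give a feel for SQL-based access and batch segmentation, the Python sketch below runs a query against an in-memory SQLite table. SQLite is only a stand-in here and is not the AEP Query Service; the table, column names and sample rows are invented for the example.

```python
import sqlite3
from typing import List


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory table with a few made-up behavior events."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (customer_id TEXT, event TEXT)")
    conn.executemany(
        "INSERT INTO events VALUES (?, ?)",
        [
            ("c1", "login"),
            ("c1", "add_to_cart"),
            ("c1", "add_to_cart"),
            ("c2", "login"),
            ("c2", "add_to_cart"),
        ],
    )
    return conn


def customers_with_event(conn: sqlite3.Connection, event: str, min_count: int) -> List[str]:
    """Batch-style segment: customers who performed `event` at least `min_count` times."""
    rows = conn.execute(
        "SELECT customer_id FROM events WHERE event = ? "
        "GROUP BY customer_id HAVING COUNT(*) >= ? ORDER BY customer_id",
        (event, min_count),
    ).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    db = build_demo_db()
    print(customers_with_event(db, "add_to_cart", 2))
```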

Application Services
-Customer Journey Analytics - Combines data from every channel. It provides an Analysis Workspace interface on top of AEP that helps visualize and explore all data from the data lake.
-Real-time Customer Data Platform (CDP) - Rich real-time customer profiles, actionable insights, data governance. CDP has segmentation capabilities.
-Journey Orchestration - Enables orchestration of triggered interactions like registration confirmations and location-based information.
-Intelligent Services - Uses AI & ML capabilities to intelligently predict customer behavior.
-Offer Decisioning - Build offers, apply decisioning and then deploy the right offer.
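The build, decide and deploy flow of offer decisioning can be sketched in a few lines of Python. This is a toy model under simple assumptions (each offer has an eligibility rule and a priority, and the highest-priority eligible offer wins); the names Offer and decide are hypothetical and this is not the AEP API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Offer:
    name: str
    priority: int
    # Eligibility rule evaluated against a profile dictionary.
    is_eligible: Callable[[Dict[str, object]], bool]


def decide(offers: List[Offer], profile: Dict[str, object]) -> Optional[Offer]:
    """Return the highest-priority offer the profile is eligible for, if any."""
    eligible = [offer for offer in offers if offer.is_eligible(profile)]
    return max(eligible, key=lambda offer: offer.priority, default=None)


if __name__ == "__main__":
    offers = [
        Offer("free shipping", 1, lambda p: True),
        Offer("loyalty discount", 5, lambda p: "loyalty" in p.get("segments", [])),
    ]
    chosen = decide(offers, {"segments": ["loyalty"]})
    print(chosen.name if chosen else "no offer")
```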

Use cases of AEP

1) Real-time customer data platform - Stitch known and unknown data to activate customer profiles.
2) Customer journey intelligence - Use data-driven methodologies, best practices, AI & ML to enable real-time decisions and actions.
3) Delivery and cross-channel experience - Deliver consistent and personalized experiences across all channels by combining the platform with Experience Cloud products.
4) Customer experience application development - AEP provides an open and extensible platform with low-latency access to profiles, decisions and insights for creating new customer experience applications.