Microsoft Azure Cloud & Einstein Analytics
In this blog, I am focusing on data acquisition via Microsoft Azure SQL database in conjunction with Einstein Analytics replication connector. Microsoft Azure SQL database is a relational database as a service (DBaaS) offering from Microsoft for building, testing, deploying, and managing applications and services through a global network of Microsoft-managed data centers under the Azure Cloud offerings.
Like always, I am emphasizing more on utilizing declarative capability (Point & Click) as much as possible. Therefore feel free to try this out as we have free trial subscriptions available from Microsoft and Salesforce.
“Curiosity is the wick in the candle of learning” : – William Arthur Ward
In this fast-changing technology era, having a single source of truth is something every organization is thriving for. Currently, information is widely scattered over heterogeneous systems and not easily available for the end users to consume. In order to reduce this gap, there have been new replication connectors introduced on the Salesforce Analytics platform in every subsequent release. Special thanks to Salesforce product team as they have been working constantly hard to enhance the platform with new functionalities.
Based upon the data ‘Pull’ mechanism Einstein Analytics has several out of the box data connectors on the platform. Not only these can be scheduled individually but further down you can also prepare or transformed in your pre-post dataflow operations to simplify it for your end users on the dashboards. Below is one of the connector I will be using for my demo (Snapshot Below).
Snapshot: EA Azure Connector
Key Considerations (Before Testing Use case):
- If you are really new to Windows Azure workspace. Then you need to subscribe to the free trial Azure environment from the Link. You may have to provide your credit card or debit card information in order to activate your account. Don’t worry you won’t be charged until you don’t opt in exclusively for the paid services. It’s just to confirm the end user legitimacy on the platform Microsoft has this policy.
- You will get free subscription credit for up to 200 $ or up to 170 Euros which should be used within the first 30 days of sign-up and 12 months of select free services (services subject to change). Worth checking some T&C of free account link based upon your country location. Link
- Storage Limits: For example, you get 5 GB of File storage each month. If in a month, you only use 2 GB, the remaining 3 GB doesn’t roll over to the next month. To avoid getting charged, keep your usage within the limits. To learn about the limits of free services or you can monitor your usage from the Link. Else I would recommend deleting the database to avoid any charges. Or ask your company to give you some playground to build some meaningful use cases.
- You can always create free Salesforce Einstein Analytics account with no expire limit from the Link:
Einstein Analytics & Azure SQL Database.
Below are the key components I am using for my use case:
- Microsoft Azure SQL Database
- SQL Server Management Studio (SSMS)
- Salesforce Einstein Analytics Free Dev Org – (Could be same or any other standalone Reporting Org)
Use case Overview:
In order to simplify the use case for a broader audience. I have broken down the integration into a few steps.
- Step1: Database Creation in Azure Portal
- Step2: Setting up a server level firewall rule in the Azure Portal. (For ex. IP Ranges Whitelisting)
- Step3: Database connection establishment with SQL Server Management Studio
- Step4: Create Tables/schema with SSMS OR Used dummy existing table for your use case
- Step5: Azure SQL Database connection establishment in Einstein Analytics
- Step6: Azure based Dataset creation in Einstein Analytics.
Solution Approach (Deep Dive):
Step1: Database Creation in Azure Portal:
An Azure SQL database is created with a defined set of compute and storage resources. The database is created within an Azure resource group and in an Azure SQL Database logical server.
a) Click Create a resource in the upper left-hand corner of the Azure portal and select SQL Database from the right palette. (Snapshot Below).
Snapshot: Microsoft Azure Portal Overview Page
b) On the New page, create your first SQL Database and give it a meaningful name. Make sure to select the ‘Free Trial’ subscription for your selection. Then create a new server with the closest server location selection enabled in order to keep your database utilization as minimum as possible (Save your Database & Server credentials on the notepad). (Snapshot Below)
Keep the pricing tier ‘Standard’ for our demo use case and press ‘Apply’ (Snapshot Below).
Now either you can create a blank database then later assign your database tables via SSMS to this database. OR Like me, you can choose dummy test database already created ‘Sample (AdventureWorksLT)’ to check your integration. Don’t forget to pin this database to your dashboard so you will be seeing upfront when you log in to the Azure portal every time. Then press ‘CREATE’ (Snapshot below).
c) Now that we have completed the SQL Database form, click Create to provision the database. Provisioning takes a few minutes (Snapshot Below).
d) Now if you refresh the page or click at the very left top on the logo. This will redirect you to your dashboard and will showcase your newly pinned assets for example iamtb SQL Server & my EinsteinAnalyticsDB SQL database (Snapshot Below.)
Step2: Setting up a server level firewall rule in the Azure Portal. (For ex. IP Ranges Whitelisting etc)
The SQL Database service creates a firewall at the server-level that prevents external applications and tools from connecting to the server or any databases on the server unless a firewall rule is created to open the firewall for specific IP addresses. Below are the steps essential for whitelist our client IP Address.
a) Log in to the Azure portal. Link
b) Select SQL Databases (EinsteinAnalyticsDB) from the left-hand menu and click your database on the SQL databases page.
c) Click Set server firewall on the toolbar. The Firewall settings page for the SQL Database server opens (Snapshot Below).
d) Then add the client IP Address or give some meaningful name to your IP Rule (Snapshot Below).
Note: SQL Database communicates over port 1433. If you are trying to connect from within a corporate network, outbound traffic over port 1433 may not be allowed by your network’s firewall. If so, you cannot connect to your Azure SQL Database server unless your IT department opens port 1433 & whitelist the IP ranges. Further information regarding Salesforce IP Ranges information are mentioned in the following link in case if your DB is sitting on the corporate VPN then you have to whitelist the SFDC IP Ranges for your EA Replication connection enablement (Link). By default, access through the SQL Database firewall is enabled for all Azure services. Click OFF on this page to disable for all Azure services.
Step3: Database connection establishment with SQL Server Management Studio (SSMS):
We need SSMS integrated environment for administering & managing our SQL infrastructure and develop all components of Azure SQL Database for our demo. In case you have opted for a blank database above then you will create your DB Table in SSMS. Else you can access the already created table for your ‘Adventure Dummy Database’ in this environment.
Now in the next step, we will focus on establishing a connection to our Azure SQL Database server
a) Download SSMS: In case if it’s not installed already on your local machine please click the following link to download the latest version in your root directory – SSMS Download Link (Snapshot Below).
b) SQL Server Connection Information: Now we need to get the fully qualified server name for our Azure SQL Database server in the Azure portal. We need this information to connect to our server using SQL Server Management Studio. Please follow the below steps to get your server information to access the SSMS.
- Log in to the Azure portal- Link
- Select your SQL Database (EinsteinAnalyticsDB) from the left-hand menu and click your database on the SQL databases page.
- In the Essentials panel in the Azure portal page for your database, locate and then copy the Server name. (Snapshot Below)
c) Connect your Azure SQL DB with SSMS: Open SQL Server Management Studio and then in the ‘Connect to Server’ dialog box, enter the following information and then press Connect (Snapshot Below) –
- Server type: Database engine
- Server name: The fully qualified server name (The one you have copied)
- Authentication: SQL Server Authentication
- Login: The server admin account (This is the account that you specified when you created the server)
- Password: The password for your server admin account (This is the password that you specified when you created the server)
d) As I am using the pre-configured dummy (Adventure Works) database. Therefore I am having an access to all of the dummy tables & schema now in my server studio. These tables I will later try to bring it to my Einstein Analytics instance for example Product, CustomerAddress, Address, Customer tables (Snapshot Below).
Snapshot: DB Table Overview SSMS Studio
Snapshot: Customer Table SSMS Studio
Till now we have successfully accomplished our first 4 steps from our use case. Now in the next steps, we will try to fetch (pull) these tables via EA Azure Connector in the replication cache. Then in the last step, we will consume these tables in our data flow to create our ‘Azure Dataset’ in our EA Instance.
Step5: Azure SQL Database connection establishment in Einstein Analytics:
In this step, we will log in to your Einstein Analytics instance. Make sure to have the replication enabled in your instance. This setting enablement is essential to activate the replication connector functionality in your instance.
a) Now go to your Salesforce Environment and then Analytics Studio. Then you need to press ‘Data Manager’ in the very right corner (Snapshot below).
Snapshot: EA Data Manager
b) Press ‘Setup’ from the left panel and then press ‘Set-Up Replication’ in the very right corner (Snapshot Below).
Snapshot: EA Set up-Replication
c) Select the connection type ‘Windows Azure SQL Database’ for your new remote connection (Snapshot Below).
Snapshot: EA Azure Connector
Then insert your earlier copied database credentials and give some unique name to this connection and developer name with some description and press save. Make sure to put the right syntax in the credentials. Esp. your schema name and JDBC connection URL and then press SAVE (Snapshot Below).
“Note: Make sure to remove the blank spaces and do the right formatting structure for your JDBC Connection URL” (Sample Below):
SAMPLE Azure Remote Connection Details Below:
- Developer Name: Any meaningful name
- Description: Any meaningful inputs
- Database: You’re SQL Database (For example EinsteinAnalyticsDB in my case)
- JDBC Connection URL: <host> : <port>, default port is 1433 (For example: iamtb.database.windows.net:1433 in my case)
- Username: You’re username for your database (For example : varmehta in my case)
- Password: You’re DB Password.
- Schema: You’re DB Schema. Else those who are using the dummy table then put ‘SalesLT’ for your schema inputs. Below is snapshot how you can identify the schema information from SSMS.
Snapshot: DB Schema.Table
Below is the quick snapshot from my EA instance :
Snapshot: My EA Azure Connection Details
If everything you have followed properly then you can see your Azure Database table in your EA Org. If that’s the case then you all did a fantastic job 🙂 else just try again some pieces were missing you will get there.
d) Select the table and then Press Continue. By doing this you can enable the selected table to be pulled out for your EA instance. In my case, I am enabling Product Category for the snapshot. Rest all are greyed out as I have already replicated them and also you can see the status ‘Enabled’ next to the table. Then press ‘Continue’ (Snapshot Below)
Snapshot: Azure EA Replication Sample Table
Snapshot: Product Category Enablement
Below you can see your table attributes and for now, I am selecting all. Then Press Continue. From this action, we are trying to bring the selected attributes to our EA instance. You can select the relevant attributes for your use case then Press SAVE.
Snapshot: Product Category Table Attributes
Replication Run: As you have selected the attributes from your Azure DB instance to be fetched for this replication job run. In the next step run your replication job and monitor the job in the data manager. Else you can run the individual table at the very right icon for your run. For the demo, I am re-running all (Snapshot Below).
Snapshot: Replication Initiation
Snapshot: Replication Monitor (Azure DB – Data processing Initiated)
Snapshot: Replication Monitor (Azure DB – Data processing Successful)
Step6: Azure based Dataset creation in Einstein Analytics.
In this step, we will ingest our replicated Azure table in our data flow to register the dataset. In order to simplify the use case, I am just bringing one Azure source node and registering it straight as my first ‘Azure_Dataset’ in my EA instance. Now it’s in our EA instance that also means it’s further available to transform or slicing & dicing based upon our business needs. In order to bring the replication table as a source node, you have to click umbrella icon from your dataflow palette (Snapshot Below).
Snapshot: EA Dataflow Designer
Run Dataflow: Once you have created your nodes then simply update and run your data flow. In my case 847 Azure customer data rows have been processed successfully (Snapshot Below).
Snapshot: EA Dataflow Monitor
Voila 🙂 you can see your Azure based dataset in your org now. we all deserve a pat on the back. (Snapshot Below)
Snapshot: EA Azure Dataset
Snapshot: Azure Value Table View
- In the above scenario, I have demonstrated the external data integration (Azure Data) to feed my target dataset in my EA instance as a unidirectional flow. But indeed this can be easily extended in case if you want to write back the data from EA to an external source via EA API’s
- Think ahead and focus more on designing your table schemas. You really need to have the right data integration strategy in order to perform these operations with right counterparts. When to perform Delta? When to append? When to overwrite? These are business-relevant questions for your use case. Don’t throw everything & clean later else you will be stuck in the loop of so-called optimizing.
- This use case can easily be extended to your ETL. So your predefined ETL tool can ingest and encapsulate & transform the business logic. In the end, your data flow will have to do the less heavy lifting and your SAQL queries will perform better on your dashboard.
Last but not least I would like to thank my dear friend Gugi helping me finalizing the architect for this article. For the new audience on this EA Connector topic.I hope it gave you all a fair idea about external data integration via Azure in Einstein Analytics. Let me know your feedback or suggestions.