A CLOSER LOOK AT TEXT ANALYTICS (SCORE SENTIMENT) IN POWER BI DESKTOP NOVEMBER UPDATES

This post is a follow up on my previous post of Sentiment Analysis in Power BI Desktop. In Power BI Nov updates, some AI transformation capabilities have been added to Power Query including sentiment analysis which is part of Text Analytics. You will see the AI Transforms in Power Query Home ribbon.

AI Transform

If you don’t see it, you need to enable it by going to File, Options and settings, Options and then tick the box for AI insights function browser. The Learn more hyperlink has not been updated to take you to relevant page as of this writing. It will take you to Microsoft main page instead, I hope this will be rectified soon. preview feature.PNG

To use these new AI features, you need to have access to a premium capacity backed workspace. If you don’t and clicked Text Analytics for example, you will see below error message.

error message for no premium capacity

If you do, this is what you will see. Select Score sentiment in the left navigation pane for Text Analytics and in the Text drop-down select the text column (Comment field is the one in my sample data) you want to do sentiment analysis and click OK.

ai with premium capacity

A Score sentiment column has been added to give you the score for each text value in Comment column. Also notice that behind the scene, a Score sentiment function has been added.

1.PNG

If you look at the M code generated for the Table query, a M function named AIFunctions.PostProcess has been called to give you the output of Score sentiment column.

M code

As of this writing, Power Query M function reference has not been updated yet so I can’t find any reference there.

2

However if you paste below M code in a blank M query, you will see all AI functions reference in Power Query and if you click the blank space in column Value for the first row, you will see the definition for AIFunctions.PostProces.

let
Source = #shared,
#”Converted to Table” = Record.ToTable(Source),
#”Filtered Rows” = Table.SelectRows(#”Converted to Table”, each ([Name] = “AIFunctions.Capacities” or [Name] = “AIFunctions.Contents” or [Name] = “AIFunctions.ExecuteInternal” or [Name] = “AIFunctions.GetAutoMLEntity” or [Name] = “AIFunctions.PostProcess”))
in
#”Filtered Rows”

3.PNG

If you have read my previous post, you will see that performing Text Analytics in the latest Power BI desktop is a lot easier if you have access to a premium workspace. When you publish your PBIX file, you will need to publish it to a premium workspace as well so it will continue to work in Power BI service.

Sentiment Analysis in Power BI Desktop

Edit: It’s not a very intuitive process to do sentiment analysis in Power Query. Good news is that from sometime next week text analytics will be accessible directly from Power BI Desktop. Read below for the announcement from Microsoft or you can read my new post to see how this works in Power BI November updates.

https://powerbi.microsoft.com/en-us/blog/weaving-business-intelligence-into-the-fabric-of-the-organization-with-microsoft-power-bi/

In my last post of book review, the author demonstrated how to use dataflow AI capability for sentiment analysis, but that’s a premium workspace feature. In this post I will show you how you can do sentiment analysis in Power BI desktop.

1. You need to have a Azure subscription. It’s free to sign up and you will also get bonus $280 credit to use Azure services for 30 days.

5

2. Log into Azure portal and click All resource in the left pane and click Add.

6

3. Search Text Analytics and click Create

4

4. Fill required fields and click Create. Note that I have selected the free F0 pricing tier.

3

5. Once deployed, click on Keys in the left navigation pane and you will see 2 keys. We will need one of them later. 8

6. To keep this demo simple, I have copied some restaurant review for a place near my office and I have added index column as id. All done in Power Query with a couple of clicks. You need to ensure that text column doesn’t contain empty values and id column doesn’t have duplicates. Also if your id and text columns have different names, rename them to id and text respectively or the Text Analytics will not work.

 

 

7. Now rename your query to Message. In the Queries pane right click on the Message query and select Reference, then right click on the newly generated query Message (2) and rename it to SentimentAnalysis.

7

8. With the SentimentAnalysis query selected in the queries pane, click on the fx button in the fomula bar, you will see a new step called Custom1 added in the Applied Steps pane.

9.png

9. Change the fomula to below and you will notice a Json file has been created. This is required because the Text Analytic API needs Json file as input and it will also output Json file later in the steps which we can easily convert to table format with Power Query.10.PNG

10. Copy one of the Keys from step 5 and create a new Power Query parameter, name it APIKey or something you like and paste the key in Current Value and click OK.

11

12.png

11. Click fx in the formula bar and a new step Custom2 will be created and then replace the M formula with below formula (replace the red text with name from step 4). At this point you will see a Json output has been returned from the API call.

= Web.Contents(“https://textanalysis-demo.cognitiveservices.azure.com/text/analytics/v2.1/sentiment”, [Headers=[#”Ocp-Apim-Subscription-Key”=APIKey], Content=Custom1])

13

12. Double click the Json icon you will see a record with the documents and errors fields and with List objects as values.

14

 

13. Select the List object of the documents field and then click To Table to convert the list to a table.15.PNG

14. Leave the To Table window as is and select OK.

16.PNG

15. Click the expand button, uncheck the Use original column name as prefix then click OK.

18

16. Now you will see the id of the text and score returned by the API call. Change the data type for id column to Whole Number by right clicking on id column and select Change Type then finally select Whole Number data type. What’s missing at this point is the text field which we can bring back easily with Merge Queries feature in Power Query.

19.PNG

17. With the sentimentAnalysis query selected in the query pane, click Merge Queries button in the Combine section of Home ribbon. When you see the Merge window, select the Message query from the drop down and make sure id column in both tables are selected then click OK.

20.PNG

18. Click the Expand button, uncheck id and click OK.

21.png

19. Now you will see the text along with id and score.

22

P.S. With the free pricing tier, you can only make 5,000 API calls for free. So don’t not make excessive API calls. Also limit each message under 5,000 characters to avoid any errors and pass less than 1,000 rows of text per API call. Please also disable Power Query background data refresh or you will reach the limit very quickly if you need to do some serious text analytics.

22

You can download the sample file here. You need enter your own Azure Text Analytics service Key to be able to use the file.

BOOK REVIEW – POWER BI MVP BOOK CHAPTER 10 AI In Power BI

Chapter 10 covers some of the Artificial Intelligence (AI) capabilities in Power BI Service and Power BI desktop. The author showcased how to set up sentiment analysis and image tagging in Power BI dataflow. Dataflows, along with Dashboards, Reports, Workbooks and Datasets are artifacts of Power BI workspace. However, AI features of dataflow only works with Power BI workspace in premium capacity.

2

If you try to use AI feature of dataflow in workspaces not backed by premium capacity, you will get below error.

1

In chapter 10, the author also introduced Key Influencer visual in Power BI desktop. This is the first visual released by the AI team in Power BI. You can watch some demo videos from Guy in a cube YouTube channel here and another one form Power BI YouTube channel here.

In my next post I will walk you through how to set up sentiment analysis in Power BI desktop if you don’t have access to Power BI premium workspace.

BOOK REVIEW – POWER BI MVP BOOK CHAPTER 8&9

Chapter 8 is introduction for DAX and it covers all the basics about DAX which includes:

  • Calculated Columns
  • Measures
  • Calculated Tables
  • Calculate Function (the most important DAX function)
  • Variables
  • Time Intelligence Functions

It’s a good start for beginners. If you already have working knowledge about DAX, you may want to skip this chapter. If you want to learn more about DAX, I highly recommend Marco Russo and Alberto Ferrari’s book here .

1.PNG

Chapter 9 talks about the difference and similarities between Excel and Power BI. Excel, as part of Office 365, has a built-in add in called Power Pivot, together with Power Query and Power Map enable business analyst to create BI models right inside Excel. Before we had Power BI desktop, that’s how business people create self service BI reports in Excel and share those reports with end users on SharePoint on premise. Power BI, on the other hand, is another (the preferred) way of creating BI reports in Power BI desktop and share reports online by publishing your model to the cloud. (You can also share Power BI reports locally with Power BI Report Server). I started building Power Pivot models back in 2010 so Excel BI always has a place in my heart 😊.

Both Excel and Power BI use the same engine behind the scenes. They are actually SQL Server Analysis Service Tabular model (SSAS Tabular) running on your local machine. Once you master either Excel or Power BI, it’s an easy transition to build larger data model in SQL Server Analysis Service. By the way, if you master both of them, you may consider getting certified for MCSA: BI Reporting.

 

 

 

Find Strings of One Column in Another

I came across a very interesting problem recently. I want to find out whether the text string in one column exists in another. Take the first row for example, I want to see if the string (‘Apple Pie’) in column 1 actually contains the string (‘pie’) in column 2.

0

I always thought that DAX function SEARCH and FIND are used to find a fixed text string in one column, like below,

12

However, when I tried below, it also worked. Happily surprised 😊. FIND returns error here because it does case sensitive search. Even you only have one row returns error, the whole column will return error, very disappointing!

3

Then I started wondering whether DAX has other functions that can do the same thing? Sure it does. CONTAINSSTRING and CONTAINSSTRINGEXACT do just that, but you need to switch the column sequence here where the first parameter is the ‘Within Text’ (‘Find Text’ is the first parameter for SEARCH and FIND). CONTAINSSTRINGEXACT is case sensitive, but it doesn’t return error like FIND does.

So, you know which function you should use in the future. CONTAINSSTRINGEXACT wins over FIND for case sensitive search.

45

Can you use these 2 functions to find fixed text string? oh yeah!

 

6

Some folks like to do things in Power Query and it’s actually best practice that you should add custom columns in Power Query with M functions whenever possible. So let’s see how we can do this with M. Text.Contains function does the same thing here, but it also has the optional third parameter for case insensitive search. Leave it out if you want to do case sensitive search.

7

In summary, if you want to find out if text string in one column exists in another, you should do it with M as it’s more flexible and best practice to do so. Only use DAX calculated column when you can’t achieve what you want to do with M.

 

BOOK REVIEW – POWER BI MVP BOOK CHAPTER 6&7

These 2 chapters walk you through the importance of building a star schema data model and establishing many to one relationships between your tables. For simple data model, you will usually have one fact table and several dimension tables around it. But you can also have multiple fact tables in your data model. If this is the case, make sure you have shared or conformed dimension tables to slice and dice your fact tables. For example, if you have sales and purchase transaction tables, then you can have dates or products table as shared dimension for your 2 fact tables. When you set up your shared dimension table make sure it has the master list of all dimension members from your fact tables.

Many to one relationship is the recommended for most of your analysis needs because the Vertipaq engine works best with this type of relationship. However, Power BI also supports many to many relationship, but you have to use this with caution. If you don’t understand how it works you should never use it because you may end up having circular reference problem.

Also, a couple of tips and tricks when you set up your data model:

  • hide technical columns such as key columns
  • hide relationship columns from the many side table
  • set data type during ETL in Power Query
  • avoid using implicit measure

Ok, wanna learn more about best practice of Power BI data model, these 2 chapters got you covered.

 

BOOK REVIEW – POWER BI MVP BOOK Chapter 4

Chapter 4 talks about the importance of having one or multiple proper dates table in your data model. If different audience in your organisation wants to slice the data in different ways, for example, sales team needs to look at revenue using a custom 4-4-5 calendar and finance department needs to report on fiscal calendar, then Power Query will enable you to create such tables with ease. If you already have dates table suitable for your analysis in our corporate data warehouse, use it. If you don’t, Power Query comes to rescue.

Additionally, dates table built with Power Query can be scaled automatically with your transaction data, which means you will never have to ask your IT department to expand your corporate dates table. Sounds cool, huh? I really enjoy reading this chapter as I have learned how to build special 4-4-5,4-5-4 and 5-4-4 calendar table. This will come handy if ever need this.

P.S. There will be no review for chapter 5 as it’s introduction to Power Query. Good for reader with no Power Query knowledge at all. In my view this chapter should have been the first chapter.