Skip to content

Latest commit

 

History

History
453 lines (394 loc) · 23.3 KB

README.md

File metadata and controls

453 lines (394 loc) · 23.3 KB

LinkedIn Sisense-open-source


Sisense + GPT-3: A new way to supercharge your analytics

This project demonstrates how to bring the power of large scale language models directly into Sisense Fusion by connecting to GPT-3. We will show how to leverage the unique flexible capabilities of Sisense Fusion by building and running Python code from Sisense to phrase dynamic queries to GPT-3, automatically generating new database tables and relationships in Sisense based on the results from GPT-3.

Table of Contents
  1. About The Project
  2. Getting Started
  3. Usage
  4. General Utilities
  5. Resources
  6. Contact


About The Project

This project explains how to leverage Sisense's Custom Code Transformation feature together with our extensive API library to connect with generative language models, such as GPT. You'll find here sample code and documentation on how to enrich data from Sisense with GPT, via your own IDE as a developer experience, as well as a low code experience from the Sisense UI.

Folder Structure 🌵

There are two main sections to this project:

  1. Custom Code Notebooks This section showcases sample Custom Code Transfomration IPython notebooks of server side code, that holds the logic to handle textual prompts from the client, extract data from sisense via api, enrich data with GPT and write back to sisense. The notebooks are in accordance with the Custom code Transformation feature that can be accessed through the Sisense UI.
  2. BloX Examples This section holds a full example of BloX client side code that is complimentary to the Custom Code Transformation ipython notebooks and together create a complete app that passes the user prompt from the BloX widget to Custom Code Transformation, where data is enriched by GPT and passed back to BloX.
Folder Structure
├── images                                         ──> images for docs
│    
├── blox_examples
│   ├── action snippets 
│   │   ├── bloxWidgetSummary.json                 ──> BloX action snippet
│   │   ├── bloxDashboardSummary.json              ──> BloX action snippet
│   │   ├── bloxDataSearch.json                    ──> BloX action snippet
│   │   ├── bloxDataConnect.json                   ──> BloX action snippet
│   │   └── bloxDataSuggest.json                   ──> BloX action snippet
│   │
│   ├── images                                     ──> images for BloX widgets
│   │ 
│   ├── plugin
│   │   └── BloxAISummary                          ──> Plugin Code for BloXAISummary
│   │        └──bloxAI_summary.zip
│   │ 
│   └── Readme.md
│   
├── common
│   ├── BloxAI.dash                                ──> Example dashboard including a BloX widget
│   └── AITransformation.sdata                     ──> sdata file of an empty ElastiCube required for the Custom Code Transformation
│
├── custom_code_notebooks
│   ├── DashboardSummary1
│   │   ├──BloxAIFlows-SummarizeDashboard.png
│   │   ├──DashboardSummary.sipynb                 ──> Custom Code Transformation example
│   │   └──DashboardSummary1.ipynb                 ──> Custom Code Transformation import/export file
│   │      
│   ├── DataConnect1
│   │   ├──BloxAIFlows-ConnectRelation.png
│   │   ├──CreateRelations.sipynb                  ──> Custom Code Transformation example
│   │   └──DataConnect1.ipynb                      ──> Custom Code Transformation import/export file
│   │      
│   ├── DataSearch1
│   │   ├──BloxAIFlows-DataSearch.png
│   │   ├──DataSearch.sipynb                       ──> Custom Code Transformation example
│   │   └──DataSearch1.ipynb                       ──> Custom Code Transformation import/export file
│   │      
│   ├── DataSuggest1
│   │   ├──BloxAIFlows-SuggestRelation.png
│   │   ├──DataSuggest1.ipynb                      ──> Custom Code Transformation example
│   │   └──SuggestRelation.sipynb                  ──> Custom Code Transformation import/export file
│   │      
│   ├── WidgetSummary1
│   │   ├──BloxAIFlows-SummarizeWidget.png
│   │   ├──WidgetSummary.sipynb                    ──> Custom Code Transformation example
│   │   └──WidgetSummary1.ipynb                    ──> Custom Code Transformation import/export file
│   │ 
│   ├── utils
│   │   ├── AIIntegration.py                       ──> manage the GPT integrations
│   │   ├── AIQueries.py                           ──> manage the conenction with openAI 
│   │   ├── AIUtils.py                             ──> utils and shared objects
│   │   ├── SisenseAPI.py                          ──> Sisense API connection
│   │   ├── customcode_errors.py                   ──> Python error codes
│   │   └── InferenceQuestionType.py               ──> Infer question type from the prompts
│   │ 
│   ├── __init__.py     
│   └── Readme.md
│
├── test
│   └── AIIntegrationTests.ipynb
│
├── InstallBloxAI.ipynb                            ──> ipython notebook with an installer code for the project
├── InstallBloxAI.sipynb                           ──> Custom Code notebook that can be imported and help install the project directly through Custom Code Transformation   
├── config.json                                    ──> Configuration file for the installer script   
├── installer.py                                   ──> Installer script logic   
├── installer.sh                                   ──> Installer script executer and wrapper
├── requirements.txt
└── README.md

Built With

  • Sisense API -
  • Sisense BloX -
  • Sisense Custom Code Transformations -
  • Python - Python

Prerequisites

  • An active Sisense account and instance
  • An Admin role is needed to enable Blox, Custom Code, and Custom Code Transformation features
  • Users must be Data Designers to control the model
  • Register at OpenAI for your own API key - OpenAI API key
  • To use DataSearch, DataSuggest, and DataConnect functionality, a Live model on Snowflake or Redshift is required ("Read" permission sufficient)
  • To use DashboardSummary and WidgetSummary, you can use any data model
  • Familiarity with Custom Code Transformations

Limitations

  • Supports small scale questions
  • GPT-3 has a limitation of 4K characters in the query size
  • Questions with a large data size are split to multiple smaller questions

(back to top)

Getting Started

Installation

Developing new language model integrations with Sisense can be done in one of the two following ways:

  • The "Remote" option - Via the Jupyter server in your Sisense instance.
  • The "Local" option – Execution via local IDE.

The default behavior is Remote execution. To run it locally, the code changes listed below are required.

Installation for Remote Development

This type of installation depends on your deployment type:

  • On-prem customers should refer here for a terminal based installation
  • Cloud managed customers and on-prem customers who prefer a UI based installation should refer here (Sisense version 23.2 and above)
Terminal Based Installation
  1. Log into Sisense and enable Custom-Code, Custom-Code Transformations and Blox as shown here
  2. Clone this repository.
  3. Edit the configuration file.
{
  "mode": "install/uninstall",
  "ip": "X.X.X.X",
  "sisense_port": "30845",
  "user": "ssh user",
  "password": "XXX",
  "sisense_username": "user@sisense.com",
  "sisense_password": "XXX",
  "lm_api_key": "OpenAI API KEY"
}
  1. Open your terminal and cd into the repository's directory.
  2. Run the following command:
./installer.sh 

Installation for Local Development

  1. Clone this repository.
  2. Install the required libraries using the following command:
pip install -r requirements.txt

Note that you need to uncomment openai library in the requirements file. 3. Extract authentication token from Sisense REST API explained here 4. Uncomment the code and Insert OpenAI API token here 5. Change log file destination path to local repo uncomment here, comment here 6. Set the following environment variables: API_GATEWAY_EXTERNAL_SERVICE_HOST={Sisense instance IP}
API_GATEWAY_EXTERNAL_SERVICE_PORT={Sisense instance port}
Cookie={Sisense authentication token}
(can be done within the notebooks)

Important Tips

  1. When executing a notebook, pay attention to the required changes listed in the section How to run the notebook
  2. You can now run the notebooks in the repo or create a new one
  3. Creating a new notebook requires additional steps in order for it to be recognised by Sisense, please refer to Adding a New Notebook
  4. You should also upload the utils files to the server at http://{Sisense IP}:{Sisense Port}/app/diag/tree/work/storage_notebooks/custom_code_notebooks/notebooks

Extracting the Authentication Token from the Sisense REST API

Using Sisense's API functionality requires an authorization token. You can get a token using an API call to /api/v1/authentication with your credentials. You can also call the API via Sisense REST API Sisense REST API Click Execute and extract the token within the response body.

Adding a New Notebook

In order for your notebook to be recognized by Sisense, you must register it. To do so, a REST API request to create a notebook is required. The payload of the request should be as follows:

{
    "displayName": "Example",
    "description": "",
    "icon": "",
    "timeout": 0,
    "group": "",
    "additionalParameters": [],
    "isTableDiscoveryEnabled": true,
    "columns": [
        {
            "name": "example",
            "id": "example",
            "type": 2
        }
    ],
    "id": "newNotebook"
}

You can now edit this notebook with your code, or simply upload the newly created notebook.

Configuring OpenAI's API Key

If you are using the installer, follow the installation instructions here on how to add your own API key.

If you are using the custom code notebooks directly, you can enter your own OpenAI key AIQueries.py.

(back to top)

Triggering Your Notebook

Custom Code transformations are designed to be used after data queries (JAQL calls) and operate on the data returned. The integration with language models does not require a data query as a prior step, but must compile with the existing design. For that, you can apply the following steps:

  1. Import the AITransformation.sdata into your Sisense system (which is a dummy data model that can be used for queries).
  2. Whenever you want to run your notebook, a REST API request to /datasources/{cubeName}/jaql is required. This request should include the following payload:
{
        "datasource": { "title": "AITransformation" },
        "metadata": [
            {
                "jaql": {
                    "table": "Empty",
                    "column": "Empty",
                    "dim": "[Empty.Empty]",
                    "filter": { "contains": ""},
                    "datatype": "text",
                    "merged": true,
                    "title": "Empty"
                },
                "instanceid": "00128-E54A-D6",
                "field": {
                    "id": "[Empty.Empty]",
                    "index": 0
                }
            }
        ],
        "count": 1,
        "offset": 0,
        "transformData": [
            {
                "transformDataFormat": {
                    "CustomCodeTransformerFormatType": "Json"
                },
                "customCode": {
                    "codePath": "/work/storage_notebooks/custom_code_notebooks/notebooks/{notebook_directory}/{notebook_name}.ipynb",
                    "cellsDisable": [
                        0
                    ],
                    "language": "Python",
                    "serverUrl": "customcode:8888",
                    "mode": "Post",
                    "additionalParameters": "{Any additional parameters your notebook requires}"
                }
            }
        ],
        "isMaskedResult": true,
        "format": "json",
        "widgetType": "fakewidget",
        "by": "widget",
        "dashboard": "{dashboard_name};{model_name}",
        "queryGuid": "UUID4 identifier"
    }
  • Note that you should edit "additionalParameters", "codePath", "dashboard", and "queryGuid" with your data.

(back to top)

Usage

Developing Integrations

You can find complete end-to-end examples in the following notebooks: search, dashboard summary, and widget summary.

AIIntegration.py, AIUtils.py, AIQueries.py, and InferenceQuestionType.py contain multiple functions and classes that you can use to develop new integrations. Some examples will be shown here. The code snippets below assume that the following variables were initialized:

from custom_code_notebooks.AIIntegration import AIIntegration
from custom_code_notebooks.AIUtils import AIUtils
cookie = "Your API token"
identifier = "Unique identifier for logs"
model_name = "Data model name"
add_param = dict() # Custom Code's additional parameters
utils = AIUtils(identifier, model_name, add_param, cookie)
oai = AIIntegration(utils)

Asking Multiple Questions and Storing the Results to a Table

The following example queries the AI model in batch mode for better performance. It then combines the results and stores them to a new table

table = "temp"
num_item_per_split = 8
"""
list of questions, each one containing `num_item_per_split` items at most.
For example:
[' What is the main organ in the body that is affected by each of the following diseases? \n\nBypass\nCardiac Arrest\nChemotheraphy\nChronic Headache\nDiabetes\nEar infenction\nEKG\nEpilepsy\n\nPlease include the request column in the response.\nPlease output as JSON vector.\n',
' What is the main organ in the body that is affected by each of the following diseases? \n\nHypoglycemia\nRadiotheraphy\n\nPlease include the request column in the response.\nPlease output as JSON vector.\n']
"""
questions = list()

unprocessed_answers = oai.ask_ai_in_batch_mode(questions)
abort, df_result = oai.combine_responses_in_batch_mode(questions, unprocessed_answers, num_item_per_split)
if not abort:
  types = oai.get_data_types_for_response(df_result)
  sql_import_query = utils.get_import_query(df_result, model_name, types)
   df_result_keys = df_result.keys()      
  utils.create_table_with_query(model_name, sql_import_query, table, types, df_result_keys)

Building a JAQL

The following example constructs a JAQL from a given question. The way the JAQL is constructed is affected by how the question is interpreted. To control the question interpretation (or add new types of interpretations) refer to the library InferenceQuestionType.py. oai.get_elements uses a few-shot approach combining pre-defined examples along with the given question to extract better results and construct the JAQL accordingly.

requested_data = "[dashboard sample healthcare widget TOP 10 DIAGNOSIS column diagnosis]"
options = [2, 4]
jaql, info, question_type_object = oai.get_elements(model_name,requested_data,options)

General Utilities

The following are some other utilities which you might find useful:

# Clean a response text from GPT and create a dataframe
oai.ai_queries.clean_json_response(result_text)
# Use cache while querying GPT 
oai.ai_queries.ask_ai_with_cache(self, cache_filename, query)
"""Sisense API functionality"""
# Run a JAQL
utils.run_jaql(jaql, model_name)
# Retrive a configuration
utils.get_configuration(key)
# Create a table
utils.create_table_with_query(data_model, sql_import_query, table_name, types, data_df)
# Create an output widget
utils.get_output_widget_jaql(table_name, types, data_df)

(back to top)

Resources

Blogs
https://www.sisense.com/blog/supercharge-analytics-with-chatgpt/

https://www.sisense.com/blog/how-to-make-bi-and-analytics-adoption-simple/

Webinars
https://www.sisense.com/webinars/revolutionizing-analytics-with-gpt/

Sisense Documentation
Blox Overview - https://docs.sisense.com/main/SisenseLinux/overview-sisense-labs.htm
Custom Code Transformation - https://docs.sisense.com/main/SisenseLinux/transforming-query-results-with-python.htm

(back to top)

Contact

For inquiries contact ai-github-help@sisense.com

(back to top)